GLYCOSIDASE ENZYMES
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates to newly identified polynucleotides, polypeptides encoded by
such polynucleotides, the use of such polynucleotides and polypeptides, as well as the
production and isolation of such polynucleotides and polypeptides. More particularly, the
polynucleotides and polypeptides of the present invention have been putatively identified as
glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases.

2. Description of Related Art

The glycosidic bond of β-galactosides can be cleaved by different classes of
enzymes: (i) phospho-β-galactosidases (EC 3.2.1.85), which are specific for a phosphorylated
substrate generated via phosphoenolpyruvate phosphotransferase system (PTS)-dependent
uptake; (ii) typical β-galactosidases (EC 3.2.1.23), represented by the Escherichia coli LacZ
enzyme, which are relatively specific for β-galactosides; and (iii) β-glucosidases (EC
3.2.1.21) such as the enzymes of Agrobacterium faecalis, Clostridium thermocellum,
Pyrococcus furiosus or Sulfolobus solfataricus (Day, A.G. and Withers, S.G. (1986)
Purification and characterization of a β-glucosidase from Alcaligenes faecalis. Can. J.
Biochem. Cell Biol. 64, 914-922; Kengen, S.W.M., et al. (1993) Eur. J. Biochem. 213,
305-312; Ait, N., Creuzet, N. and Cattaneo, J. (1982) Properties of β-glucosidase purified
from Clostridium thermocellum. J. Gen. Microbiol. 128, 569-577; Grogan, D.W. (1991)
Evidence that β-galactosidase of Sulfolobus solfataricus is only one of several activities of
a thermostable β-D-glycosidase. Appl. Environ. Microbiol. 57, 1644-1649). Members of
the latter group, although highly specific with respect to the β-anomeric configuration of
the glycosidic linkage, often display a rather relaxed substrate specificity and hydrolyze
β-glucosides as well as β-fucosides and β-galactosides.

Generally, α-galactosidases are enzymes that catalyze the hydrolysis of galactose
groups on a polysaccharide backbone or catalyze the cleavage of di- or oligosaccharides
comprising galactose.

Generally, β-mannanases are enzymes that catalyze the hydrolysis of mannose
groups internally on a polysaccharide backbone or catalyze the cleavage of di- or
oligosaccharides comprising mannose groups. β-Mannosidases hydrolyze non-reducing,
terminal mannose residues on a mannose-containing polysaccharide and catalyze the cleavage
of di- or oligosaccharides comprising mannose groups.

Guar gum is a branched galactomannan polysaccharide composed of a β-1,4 linked
mannose backbone with α-1,6 linked galactose side chains. The enzymes required for the
degradation of guar are β-mannanase, β-mannosidase and α-galactosidase. β-Mannanase
hydrolyzes the mannose backbone internally, β-mannosidase hydrolyzes non-reducing,
terminal mannose residues, and α-galactosidase hydrolyzes α-linked galactose groups.

Galactomannan polysaccharides and the enzymes that degrade them have a variety
of applications. Guar is commonly used as a thickening agent in food and is utilized in
hydraulic fracturing in oil and gas recovery. Consequently, galactomannanases are
industrially relevant for the degradation and modification of guar. Furthermore, a need
exists for thermostable galactomannanases that are active in extreme conditions associated
with drilling and well stimulation.

There are other applications for these enzymes in various industries, such as in the
beet sugar industry. 20-30% of the domestic U.S. sucrose consumption is sucrose from
sugar beets. Raw beet sugar can contain a small amount of raffinose when the sugar beets
are stored before processing and rotting begins to set in. Raffinose inhibits the
crystallization of sucrose and also constitutes a hidden quantity of sucrose. Thus, there is
merit to eliminating raffinose from raw beet sugar. α-Galactosidase has also been used as
a digestive aid to break down raffinose, stachyose, and verbascose in such foods as beans
and other gassy foods.

β-Galactosidases which are active and stable at high temperatures appear to be
superior enzymes for the production of lactose-free dietary milk products (Chaplin, M.F.
and Bucke, C. (1990) In: Enzyme Technology, pp. 159-160, Cambridge University Press,
Cambridge, UK). Also, several studies have demonstrated the applicability of
β-galactosidases to the enzymatic synthesis of oligosaccharides via transglycosylation
reactions (Nilsson, K.G.I. (1988) Enzymatic synthesis of oligosaccharides. Trends
Biotechnol. 6, 156-264; Cote, G.L. and Tao, B.Y. (1990) Oligosaccharide synthesis by
enzymatic transglycosylation. Glycoconjugate J. 7, 145-162). Despite the commercial
potential, only a few β-galactosidases of thermophiles have been characterized so far. Two
reported genes encode β-galactoside-cleaving enzymes of the hyperthermophilic bacterium
Thermotoga maritima, one of the most thermophilic organotrophic eubacteria described to
date (Huber, R., Langworthy, T.A., Konig, H., Thomm, M., Woese, C.R., Sleytr, U.B. and
Stetter, K.O. (1986) Thermotoga maritima sp. nov. represents a new genus of unique extremely
thermophilic eubacteria growing up to 90°C. Arch. Microbiol. 144, 324-333). The gene
products have been identified as a β-galactosidase and a β-glucosidase.

Pullulanase is well known as a debranching enzyme of pullulan and starch. The
enzyme hydrolyzes α-1,6-glucosidic linkages on these polymers. Starch degradation for
the production of sweeteners (glucose or maltose) is a very important industrial application
of this enzyme. The degradation of starch is carried out in two stages. The first stage
involves the liquefaction of the substrate with α-amylase, and the second stage, or
saccharification stage, is performed by β-amylase with pullulanase added as a debranching
enzyme, to obtain better yields.

Endoglucanases can be used in a variety of industrial applications. For instance, the
endoglucanases of the present invention can hydrolyze the internal β-1,4-glycosidic bonds
in cellulose, which may be used for the conversion of plant biomass into fuels and
chemicals. Endoglucanases also have applications in detergent formulations, the textile
industry, in animal feed, in waste treatment, and in the fruit juice and brewing industry for
the clarification and extraction of juices.

Brief Description of the Drawings

The following drawings are illustrative of embodiments of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims.

Figures 1a-b are the full-length DNA and corresponding deduced amino acid
sequence of M11TL of the present invention. Sequencing was performed using a 378
automated DNA sequencer for all sequences of the present invention (Applied Biosystems,
Inc.).

Figure 2 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of OC1/4V-33B/G. 

Figure 3 is an illustration of the full-length DNA and corresponding deduced amino 
acid sequence of F1-12G. 

Figures 4a-b are the full-length DNA and corresponding deduced amino acid 
sequence of 9N2-31B/G. 

Figures 5a-b are the full-length DNA and corresponding deduced amino acid 
sequence of MSB8-6G. 

Figure 6 is the full-length DNA and corresponding deduced amino acid sequence 
of AEDII12RA-18B/G.

Figures 7a-b are the full-length DNA and corresponding deduced amino acid 
sequence of GC74-22G. 

Figures 8a-b are the full-length DNA and corresponding deduced amino acid 
sequence of VC1-7G1. 

Figures 9a-c are the full-length DNA and corresponding deduced amino acid
sequence of 37GP1.

Figures 10a-c are the full-length DNA and corresponding deduced amino acid
sequence of 6GC2. 

Figures 11a-d are the full-length DNA and corresponding deduced amino acid
sequence of 6GP2. 

Figures 12a-c are the full-length DNA and corresponding deduced amino acid 
sequence of 63GB1.

Figures 13a-b are the full-length DNA and corresponding deduced amino acid
sequence of OC1/4V.

Figures 14a-e are the full-length DNA and corresponding deduced amino acid
sequence of 6GP3. 

Figures 15a-d are the full-length DNA and corresponding deduced amino acid
sequence of Thermotoga maritima MSB8-6GP2.

Figures 16a-c are the full-length DNA and corresponding deduced amino acid 
sequence of Thermotoga maritima MSB8-6GB4. 

Figures 17a-d are the full-length DNA and corresponding deduced amino acid 
sequence of Bankia gouldi 37GP4.

Figures 18a-b are the full-length DNA and corresponding deduced amino acid 
sequence of Pyrococcus furiosus VC1-7EG1. 

SUMMARY OF THE INVENTION

In a preferred embodiment of the present invention, there are provided isolated 
nucleic acids (polynucleotides) which encode mature enzymes having the deduced amino 
acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). 

In another embodiment, the invention provides a method for producing a 
polypeptide including culturing host cells containing the polynucleotide of Figures 1-18 and
expressing from the host cell a polypeptide encoded by the polynucleotide and isolating the
polypeptide. 

In another embodiment, the invention provides an enzyme selected from the group 
consisting of an enzyme having an amino acid sequence set forth in SEQ ID NOS: 15-28 
or 61-64 and an enzyme which has at least 30 consecutive amino acid residues of an enzyme
having an amino acid sequence set forth in SEQ ID NOS: 15-28 or 61-64. 

In yet another embodiment, the invention provides a method for generating glucose
from soluble cellooligosaccharides which includes contacting a sample containing
oligosaccharides with an effective amount of an enzyme selected from the group of
enzymes having the amino acid sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64
such that glucose is produced.

The publications discussed herein are provided solely for their disclosure prior to 
the filing date of the present application. Nothing herein is to be construed as an 
admission that the invention is not entitled to antedate such disclosure by virtue of prior 
invention. 

Definitions 

"Monosaccharide", as used herein, refers to a single polyhydroxy aldehyde or
ketone unit.

"Oligosaccharide", as used herein, consists of short chains of monosaccharide units
joined together by covalent bonds. Of these, the most abundant are the disaccharides,
which have two monosaccharide units.

"Polysaccharide", as used herein, consists of long chains having many 
monosaccharide units. 

The term "gene" means the segment of DNA involved in producing a polypeptide 
chain; it includes regions preceding and following the coding region (leader and trailer) as 
well as intervening sequences (introns) between individual coding segments (exons). 

A coding sequence is "operably linked to" another coding sequence when RNA 
polymerase will transcribe the two coding sequences into a single mRNA, which is then 
translated into a single polypeptide having amino acids derived from both coding
sequences. The coding sequences need not be contiguous to one another so long as the 
expressed sequences ultimately process to produce the desired protein. 

"Recombinant" enzymes refer to enzymes produced by recombinant DNA
techniques; i.e., produced from cells transformed by an exogenous DNA construct encoding
the desired enzyme. "Synthetic" enzymes are those prepared by chemical synthesis.

A DNA "coding sequence of" or a "nucleotide sequence encoding" a particular
enzyme is a DNA sequence which is transcribed and translated into an enzyme when
placed under the control of appropriate regulatory sequences.

Detailed Description of the Invention

The polynucleotides and polypeptides of the present invention have been identified
as glucosidases, α-galactosidases, β-galactosidases, β-mannosidases, β-mannanases,
endoglucanases, and pullulanases as a result of their enzymatic activity.

In accordance with one aspect of the present invention, there are provided novel 
enzymes, as well as active fragments, analogs and derivatives thereof.

In accordance with another aspect of the present invention, there are provided 
isolated nucleic acid molecules encoding the enzymes of the present invention including
mRNAs, cDNAs, genomic DNAs as well as active analogs and fragments of such enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for producing such polypeptides by recombinant techniques comprising culturing
recombinant prokaryotic and/or eukaryotic host cells, containing a nucleic acid sequence
of the present invention, under conditions promoting expression of said enzymes and
subsequent recovery of said enzymes. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes for 
hydrolyzing lactose to galactose and glucose for use in the food processing industry, the 
pharmaceutical industry, for example, to treat intolerance to lactose, as a diagnostic reporter
molecule, in corn wet milling, in the fruit juice industry, in baking, in the textile industry
and in the detergent industry. 

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes for hydrolyzing guar gum (a galactomannan 
polysaccharide) to remove non-reducing terminal mannose residues. Further 
polysaccharides such as galactomannan and the enzymes according to the invention that 
degrade them have a variety of applications. Guar gum is commonly used as a thickening 
agent in food and also is utilized in hydraulic fracturing in oil and gas recovery. 
Consequently, mannanases are industrially relevant for the degradation and modification 
of guar gums. Furthermore, a need exists for thermostable mannanases that are active in
extreme conditions associated with drilling and well stimulation.

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length to
specifically hybridize to a nucleic acid sequence of the present invention.

In accordance with yet a further aspect of the present invention, there is provided 
a process for utilizing such enzymes, or polynucleotides encoding such enzymes, for in
vitro purposes related to scientific research, for example, to generate probes for identifying
similar sequences which might encode similar enzymes from other organisms by using
certain regions, i.e., conserved sequence regions, of the nucleotide sequence.

These and other aspects of the present invention should be apparent to those skilled 
in the art from the teachings herein. 

The polynucleotides of this invention were originally recovered from genomic gene
libraries derived from the following organisms:

M11TL is a new species of Desulfurococcus isolated from Diamond Pool in
Yellowstone National Park. The organism grows optimally at 85-88°C, pH 7.0 in a low salt
medium containing yeast extract, peptone, and gelatin as substrates with a N2/CO2 gas
phase.

OC1/4V is from the genus Thermotoga. The organism was isolated from
Yellowstone National Park. It grows optimally at 75°C in a low salt medium with cellulose
as a substrate and N2 in gas phase.

Pyrococcus furiosus VC1 (and 7EG1) is from the genus Pyrococcus. VC1 was
isolated from Vulcano, Italy. It grows optimally at 100°C in a high salt medium (marine)
containing elemental sulfur, yeast extract, peptone and starch as substrates and N2 in gas
phase.

Staphylothermus marinus F1 is from the genus Staphylothermus. F1 was isolated
from Vulcano, Italy. It grows optimally at 85°C, pH 6.5 in high salt medium (marine)
containing elemental sulfur and yeast extract as substrates and N2 in gas phase.

Thermococcus 9N-2 is from the genus Thermococcus. 9N-2 was isolated from
diffuse vent fluid in the East Pacific Rise. It is a strict anaerobe that grows optimally at
87°C.

Thermotoga maritima MSB8 and MSB8 (Clone # 6GP2 and 6GB4) is from the
genus Thermotoga, and was isolated from Vulcano, Italy. MSB8 grows optimally at 85°C,
pH 6.5 in a high salt medium (marine) containing starch and yeast extract as substrates and
N2 in gas phase.

Thermococcus alcaliphilus AEDII12RA is from the genus Thermococcus.
AEDII12RA grows optimally at 85°C, pH 9.5 in a high salt medium (marine) containing
polysulfides and yeast extract as substrates and N2 in gas phase.

Thermococcus chitonophagus GC74 is from the genus Thermococcus. GC74 grows
optimally at 85°C, pH 6.0 in a high salt medium (marine) containing chitin, meat extract,
elemental sulfur and yeast extract as substrates and N2 in gas phase.

AEPII 1a grows optimally at 85°C at pH 6.5 in marine medium under anaerobic conditions.
It has many substrates.

Bankia gouldi is from the genus Bankia.

Accordingly, the polynucleotides and enzymes encoded thereby are identified by
the organism from which they were isolated, and are sometimes hereinafter referred to as
"M11TL" (Figure 1 and SEQ ID NOS:1 and 15), "OC1/4V-33B/G" (Figure 2 and SEQ ID
NOS:2 and 16), "F1-12G" (Figure 3 and SEQ ID NOS:3 and 17), "9N2-31B/G" (Figure 4
and SEQ ID NOS:4 and 18), "MSB8" (Figure 5 and SEQ ID NOS:5 and 19),
"AEDII12RA-18B/G" (Figure 6 and SEQ ID NOS:6 and 20), "GC74-22G" (Figure 7 and SEQ ID
NOS:7 and 21), "VC1-7G1" (Figure 8 and SEQ ID NOS:8 and 22), "37GP1" (Figure 9 and SEQ
ID NOS:9 and 23), "6GC2" (Figure 10 and SEQ ID NOS:10 and 24), "6GP2" (Figure 11
and SEQ ID NOS:11 and 25), "AEPII 1a" (Figure 12 and SEQ ID NOS:12 and 26),
"OC1/4V" (Figure 13 and SEQ ID NOS:13 and 27), "6GP3" (Figure 14 and SEQ ID
NOS:28), "MSB8-6GP2" (Figure 15 and SEQ ID NOS:57 and 61), "MSB8-6GB4" (Figure
16 and SEQ ID NOS:58 and 62), "VC1-7EG1" (Figure 17 and SEQ ID NOS:59 and 63),
and "37GP4" (Figure 18 and SEQ ID NOS:60 and 64).

The polynucleotides and polypeptides of the present invention show identity at the
nucleotide and protein level to known genes and proteins encoded thereby as shown in
Table 1.

Table 1

Clone                                     Gene/Protein with Closest Homology                        Protein Identity  Nucleic Acid Identity
----------------------------------------  --------------------------------------------------------  ----------------  ---------------------
M11TL                                     Sulfolobus solfataricus DSM 1616/P1, β-galactosidase      51%               55%
OC1/4V-33B/G                              Caldocellum saccharolyticum β-glucosidase                 52%               57%
Staphylothermus marinus F1-12G            Bacillus polymyxa, β-galactosidase                        36%               48%
Thermococcus 9N2-31B/G                    Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   51%               50%
Thermotoga maritima MSB8-6G               Clostridium thermocellum bglB                             45%               53%
Thermococcus AEDII12RA-18B/G              Bacillus polymyxa, β-galactosidase                        34%               48%
Thermococcus chitonophagus GC74-22G       Sulfolobus solfataricus ATCC 49255/MT4, β-galactosidase   46%               54%
VC1-7G1                                   Sulfolobus solfataricus/MT-4 β-galactosidase              46.4%             52.5%
Thermotoga maritima (6GC2)                Pediococcus pentosaceus α-galactosidase                   49%               29%
Thermotoga maritima β-mannanase (6GP2)    Aspergillus aculeatus mannanase                           56%               37%
AEPII 1a β-mannosidase (63GB1)            Sulfolobus solfataricus β-galactosidase                   78%               56%
OC1/4V endoglucanase (33GP1)              Clostridium thermocellum                                  65%               43%
Thermotoga maritima pullulanase (6GP3)    Caldocellum saccharolyticum α-dextran 6-glucanohydrolase  72%               53%
Bankia gouldi mix endoglucanase (37GP1)   None available


The polynucleotides and enzymes of the present invention show homology to each
other as shown in Table 2.

Table 2

Clone                             Gene/Protein with Closest Homology                          Protein Identity  Nucleic Acid Identity
--------------------------------  ----------------------------------------------------------  ----------------  ---------------------
Staphylothermus marinus F1-12G    Thermococcus AEDII12RA-18B/G, β-galactosidase, glucosidase  55%               57%
Thermococcus 9N2-31B/G            Thermococcus chitonophagus GC74-22G, glucosidase            74%               66%
Pyrococcus furiosus VC1-7G1       Pyrococcus furiosus VC1-7B/G β-galactosidase                46.4%             54%

All the clones identified in Tables 1 and 2 encode polypeptides which have
α-glycosidase or β-glycosidase activity.

This invention, in addition to the isolated nucleic acid molecules encoding the
enzymes of the present invention, also provides substantially similar sequences. Isolated
nucleic acid sequences are substantially similar if: (i) they are capable of hybridizing under
conditions hereinafter described, to the polynucleotides of SEQ ID NOS:1-14 and 57-60;
or (ii) they are DNA sequences which are degenerate to the polynucleotides of SEQ ID
NOS:1-14 and 57-60. Degenerate DNA sequences encode the amino acid sequences of
SEQ ID NOS:15-28 and 61-64, but have variations in the nucleotide coding sequences. As
used herein, substantially similar refers to the sequences having similar identity to the
sequences of the instant invention. The nucleotide sequences that are substantially the same
can be identified by hybridization or by sequence comparison. Enzyme sequences that are
substantially the same can be identified by one or more of the following: proteolytic
digestion, gel electrophoresis and/or microsequencing.
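The notion of degenerate DNA sequences can be illustrated with a minimal sketch. It assumes the standard genetic code and uses two short, made-up coding sequences (they are not taken from the sequence listing); the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard code applies; "*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the two sequences differ in bases but encode the same amino acids."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Illustrative sequences only: both encode Met-Ala-Lys.
print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```

Two sequences that pass this check differ only by silent changes, which is the sense in which one is degenerate to the other.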

One means for isolating the nucleic acid molecules encoding the enzymes of the
present invention is to probe a gene library with a natural or artificially designed probe
using art recognized procedures (see, for example: Current Protocols in Molecular Biology,
Ausubel F.M. et al. (Eds.) Greene Publishing Company Assoc. and John Wiley Interscience,
New York, 1989, 1992). It is appreciated by one skilled in the art that the polynucleotides
of SEQ ID NOS:1-14 and 57-60 or fragments thereof (comprising at least 12 contiguous
nucleotides) are particularly useful probes. Other particularly useful probes for this purpose
are fragments hybridizable to the sequences of SEQ ID NOS:1-14 and 57-60 (i.e.,
comprising at least 12 contiguous nucleotides).
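As a rough illustration of what "fragments comprising at least 12 contiguous nucleotides" means in practice, the sketch below enumerates every 12-nucleotide window of a sequence and computes the reverse complement to which such a fragment would hybridize. The input sequence and function names are hypothetical and not taken from the sequence listing.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with seq, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(seq: str, length: int = 12) -> list[str]:
    """List every contiguous fragment of the given length (empty if seq is too short)."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Illustrative 16-nt sequence: yields 16 - 12 + 1 = 5 candidate 12-mers.
for probe in candidate_probes("ACGTACGTACGTACGT"):
    print(probe, reverse_complement(probe))
```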

With respect to nucleic acid sequences which hybridize to specific nucleic acid
sequences disclosed herein, hybridization may be carried out under conditions of reduced
stringency, medium stringency or even stringent conditions. As an example of
oligonucleotide hybridization, a polymer membrane containing immobilized denatured
nucleic acids is first prehybridized for 30 minutes at 45°C in a solution consisting of 0.9 M
NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM Na2EDTA, 0.5% SDS, 10X Denhardt's, and 0.5
mg/ml polyriboadenylic acid. Approximately 2 × 10^7 cpm (specific activity 4-9 × 10^8
cpm/µg) of 32P end-labeled oligonucleotide probe are then added to the solution. After
12-16 hours of incubation, the membrane is washed for 30 minutes at room temperature in 1X
SET (150 mM NaCl, 20 mM Tris hydrochloride, pH 7.8, 1 mM Na2EDTA) containing 0.5%
SDS, followed by a 30 minute wash in fresh 1X SET at Tm - 10°C for the oligonucleotide
probe. The membrane is then exposed to auto-radiographic film for detection of
hybridization signals.

Stringent conditions means hybridization will occur only if there is at least 90%
identity, preferably at least 95% identity and most preferably at least 97% identity between
the sequences. See J. Sambrook et al., Molecular Cloning, A Laboratory Manual, 2d Ed.,
Cold Spring Harbor Laboratory (1989), which is hereby incorporated by reference in its
entirety. Further, it is understood that a fragment of a 100 bps sequence that is 95 bps in
length has 95% identity with the 100 bps sequence from which it is obtained.
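The identity thresholds given for stringent conditions, and the 95 bps fragment example, can be restated as a short sketch. The function names and tier labels are hypothetical; only the 90%, 95% and 97% cut-offs come from the text.

```python
def fragment_identity(fragment_len: int, parent_len: int) -> float:
    """Identity of an exact fragment relative to its parent sequence, in percent."""
    return 100.0 * fragment_len / parent_len


def stringency_tier(percent_identity: float) -> str:
    """Map a percent identity onto the tiers named for stringent conditions."""
    if percent_identity >= 97:
        return "most preferred"
    if percent_identity >= 95:
        return "preferred"
    if percent_identity >= 90:
        return "stringent"
    return "below stringent threshold"


# The 95 bps fragment of a 100 bps sequence from the text.
identity = fragment_identity(95, 100)
print(identity, stringency_tier(identity))
```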

As used herein, a first DNA (RNA) sequence is at least 70% and preferably at least
80% identical to another DNA (RNA) sequence if there is at least 70% and preferably at
least 80% or 90% identity, respectively, between the bases of the first sequence and the
bases of the other sequence, when properly aligned with each other, for example when
aligned by BLASTN.

"Identity" as the term is used herein, refers to a polynucleotide sequence which 
comprises a percentage of the same bases as a reference polynucleotide (SEQ ID NOS:1-14
and 57-60). For example, a polynucleotide which is at least 90% identical to a reference
polynucleotide has polynucleotide bases which are identical in 90% of the bases which
make up the reference polynucleotide and may have different bases in 10% of the bases
which comprise that polynucleotide sequence.
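A minimal sketch of this identity measure follows. It assumes the two sequences have already been aligned (for example by BLASTN) and are supplied as equal-length strings with "-" marking gaps; the function name and example sequences are hypothetical.

```python
def percent_identity(reference_aligned: str, query_aligned: str) -> float:
    """Percent of reference bases matched by the query in a given alignment.

    Assumes both strings come from the same pairwise alignment, so they have
    equal length and use '-' for gap positions.
    """
    if len(reference_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref = reference_aligned.upper()
    qry = query_aligned.upper()
    reference_bases = sum(1 for r in ref if r != "-")
    if reference_bases == 0:
        raise ValueError("reference contains no bases")
    matches = sum(1 for r, q in zip(ref, qry) if r != "-" and r == q)
    return 100.0 * matches / reference_bases


# Illustrative pair: 9 of the 10 reference bases are identical.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Dividing by the number of reference bases, rather than by the alignment length, follows the wording above, which expresses identity relative to the bases that make up the reference polynucleotide.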

The present invention relates to polynucleotides which differ from the reference
polynucleotide such that the changes are silent changes, for example the changes do not alter
the amino acid sequence encoded by the polynucleotide. The present invention also relates
to nucleotide changes which result in amino acid substitutions, additions, deletions, fusions
and truncations in the polypeptide encoded by the reference polynucleotide. In a preferred
aspect of the invention these polypeptides retain the same biological action as the
polypeptide encoded by the reference polynucleotide.

It is also appreciated that such probes can be and are preferably labeled with an 
analytically detectable reagent to facilitate identification of the probe. Useful reagents
include but are not limited to radioactivity, fluorescent dyes or enzymes capable of 
catalyzing the formation of a detectable product. The probes are thus useful to isolate 
complementary copies of DNA from other sources or to screen such sources for related 
sequences. 

The polynucleotides of this invention were recovered from genomic gene libraries 
from the organisms listed in Table 1. For example, gene libraries can be generated in the
Lambda ZAP II cloning vector (Stratagene Cloning Systems). Mass excisions can be
performed on these libraries to generate libraries in the pBluescript phagemid. Libraries
are thus generated and excisions performed according to the protocols/methods hereinafter
described.

The excision libraries are introduced into the E. coli strain BW14893 F'kan1A.
Expression clones are then identified using a high temperature filter assay. Expression 
clones encoding several glucanases and several other glycosidases are identified and 
repurified. The polynucleotides, and enzymes encoded thereby, of the present invention, 
yield the activities as described above. 

The coding sequences for the enzymes of the present invention were identified by 
screening the genomic DNAs prepared for the clones having glucosidase or galactosidase 
activity. 

An example of such an assay is a high temperature filter assay wherein expression
clones were identified using buffer Z (see recipe
below) containing 1 mg/ml of the substrate 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(XGLU) (Diagnostic Chemicals Limited or Sigma) after introducing an
excision library into the E. coli strain BW14893 F'kan1A. Expression clones encoding
XGLUases were identified and repurified from M11TL, OC1/4V, Pyrococcus furiosus VC1,
Staphylothermus marinus F1, Thermococcus 9N-2, Thermotoga maritima MSB8,
Thermococcus alcaliphilus AEDII12RA, and Thermococcus chitonophagus GC74.

Z-buffer: (referenced in Miller, J.H. (1992) A Short Course in Bacterial Genetics,
p. 445.)

per liter:

Na2HPO4-7H2O         16.1 g
NaH2PO4-7H2O         5.5 g
KCl                  0.75 g
MgSO4-7H2O           0.246 g
β-mercaptoethanol    2.7 ml
Adjust pH to 7.0
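Because the recipe is stated per liter, smaller batches require proportional scaling. The sketch below is a convenience calculation only; the dictionary keys and function name are hypothetical, and the amounts are copied from the recipe as listed.

```python
# Amounts per liter, copied from the Z-buffer recipe as listed above.
# Salts are in grams; beta-mercaptoethanol is in milliliters.
Z_BUFFER_PER_LITER = {
    "Na2HPO4 hydrate (g)": 16.1,
    "NaH2PO4 hydrate (g)": 5.5,
    "KCl (g)": 0.75,
    "MgSO4-7H2O (g)": 0.246,
    "beta-mercaptoethanol (ml)": 2.7,
}


def scale_recipe(volume_ml: float) -> dict[str, float]:
    """Scale the per-liter amounts linearly to the requested final volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in Z_BUFFER_PER_LITER.items()}


# Amounts for a 500 ml batch; the pH is still adjusted to 7.0 afterwards.
for name, amount in scale_recipe(500).items():
    print(f"{name}: {amount:.3f}")
```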

High Temperature Filter Assay

(1) The F' factor F'kan (from E. coli strain CSH118)(1) was introduced into the pho- phn-
lac- strain BW14893(2). The filamentous phage library was plated
on the resulting strain, BW14893 F'kan. (Miller, J.H. (1992) A Short Course in
Bacterial Genetics; Lee, K.S., Metcalf, et al. (1992) Evidence for two phosphonate
degradative pathways in Enterobacter aerogenes, J. Bacteriol., 174:2501-2510).

(2) After growth on 100 mm LB plates containing 100 µg/ml ampicillin, 80 µg/ml
methicillin and 1 mM IPTG, colony lifts were performed using Millipore HATF
membrane filters.

(3) The colonies transferred to the filters were lysed with chloroform vapor in 150 mm
glass petri dishes.

(4) The filters were transferred to 100 mm glass petri dishes containing a piece of 
Whatman 3 MM filter paper saturated with buffer. 

(a) When testing for galactosidase activity (XGALase), 3 MM paper was
saturated with Z buffer containing 1 mg/ml XGAL (ChemBridge
Corporation). After transferring the filter bearing lysed colonies to the glass
petri dish, the dish was placed in an oven at 80-85°C.

(b) When testing for glucosidase activity (XGLUase), 3 MM paper was saturated
with Z buffer containing 1 mg/ml XGLU. After transferring the filter bearing
lysed colonies to the glass petri dish, the dish was placed in an oven at 80-85°C.

(5) 'Positives' were observed as blue spots on the filter membranes. The following
filter rescue technique was used to retrieve plasmid from a lysed positive colony. A Pasteur
pipette (or glass capillary tube) was used to core blue spots on the filter membrane. The
small filter disk was placed in an Eppendorf tube containing 20 µl water. The
Eppendorf tube was incubated at 75°C for 5 minutes, followed by vortexing to elute plasmid
DNA off the filter. This DNA was transformed into electrocompetent E. coli DH10B cells
for Thermotoga maritima MSB8-6G, Staphylothermus marinus F1-12G,
Thermococcus AEDII12RA-18B/G, Thermococcus chitonophagus GC74-22G,
M11TL and OC1/4V. Electrocompetent BW14893 F'kan1A E. coli were used for
Thermococcus 9N2-31B/G and Pyrococcus furiosus VC1-7G1. The filter-lift
assay was repeated on transformation plates to identify 'positives'. The transformation
plates were returned to a 37°C incubator after the filter lift to regenerate colonies. 3 ml of
LB liquid containing 100 µg/ml ampicillin was inoculated with repurified positives and
incubated at 37°C overnight. Plasmid DNA was isolated from these cultures and the plasmid
insert was sequenced.
In some instances where the plates used for the initial colony lifts contained
non-confluent colonies, a specific colony corresponding to a blue spot on the filter could
be identified on a regenerated plate and repurified directly, instead of using the filter
rescue technique.

Another example of such an assay is a variation of the high temperature filter assay
wherein colony-laden filters are heat-killed at different temperatures (for example, 105°C
for 20 minutes) to monitor thermostability. The 3 MM paper is saturated with different
buffers (e.g., 100 mM NaCl, 5 mM MgCl2, 100 mM Tris-Cl (pH 9.5)) to determine enzyme
activity under different buffer conditions.

A β-glucosidase assay may also be employed, wherein GlcpβNp is used as an 
artificial substrate (aryl-β-glucosidase). The increase in absorbance at 405 nm as a result 
of p-nitrophenol (pNp) liberation is followed on a Hitachi U-1100 spectrophotometer 
equipped with a thermostated cuvette holder. The assays may be performed at 80°C or 
90°C in closed 1-ml quartz cuvettes. A standard reaction mixture contains 150 mM 
trisodium substrate, pH 5.0 (at 80°C), and 0.95 mM pNp derivative (ε pNp = 0.561 mM⁻¹ 
cm⁻¹). The reaction mixture is allowed to reach the desired temperature, after which the 
reaction is started by injecting an appropriate amount of enzyme (1.06 ml final volume). 

1 U β-glucosidase activity is defined as that amount required to catalyze the 
formation of 1.0 µmol pNp/min. D-cellobiose may also be used as a substrate. 
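
The unit definition above can be illustrated with a short calculation. The following is a minimal sketch, not part of the described procedure: it assumes a 1 cm optical path length (the text does not state one) and uses hypothetical function names; the extinction coefficient and final volume are the values quoted above.

```python
# Illustrative conversion of an absorbance slope into enzyme units (U).
# Assumption: 1 cm path length; function names are hypothetical.

EPSILON_PNP = 0.561      # mM^-1 cm^-1 for pNp at 405 nm, as quoted in the text
FINAL_VOLUME_ML = 1.06   # final reaction volume quoted in the text


def beta_glucosidase_units(delta_a405_per_min, path_length_cm=1.0,
                           volume_ml=FINAL_VOLUME_ML, epsilon=EPSILON_PNP):
    """Return activity in U, i.e. micromoles of pNp released per minute."""
    if epsilon <= 0 or path_length_cm <= 0 or volume_ml <= 0:
        raise ValueError("epsilon, path length and volume must be positive")
    # Beer-Lambert law: concentration change (mM/min) = dA/min / (epsilon * l)
    rate_mm_per_min = delta_a405_per_min / (epsilon * path_length_cm)
    # 1 mM equals 1 umol/ml, so multiplying by the volume in ml gives umol/min
    return rate_mm_per_min * volume_ml


def specific_activity(units, protein_mg):
    """Return specific activity in U per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return units / protein_mg
```

Under these assumptions, an absorbance increase of 0.561 per minute corresponds to 1 mM pNp per minute, or about 1.06 U in the 1.06 ml reaction.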

An ONPG assay for β-galactosidase activity is described by Miller, J.H. (1992) A 
Short Course in Bacterial Genetics and Miller, J.H. (1992) Experiments in Molecular 
Genetics, the contents of which are hereby incorporated by reference in their entirety. 

A quantitative fluorometric assay for β-galactosidase specific activity is described 
by Youngman, P. (1987) Plasmid Vectors for Recovering and Exploiting Tn917 
Transpositions in Bacillus and other Gram-Positive Bacteria. In Plasmids: A Practical 
Approach (ed. K. Hardy) pp. 79-103. IRL Press, Oxford. A description of the procedure can 
be found in Miller (1992) pp. 75-77, the contents of which are incorporated by reference 
herein in their entirety. 

The polynucleotides of the present invention may be in the form of DNA, which 
DNA includes cDNA, genomic DNA, and synthetic DNA. The DNA may be double-
stranded or single-stranded, and if single-stranded may be the coding strand or non-coding 
(anti-sense) strand. The coding sequences which encode the mature enzymes may be 
identical to the coding sequences shown in Figures 1-18 (SEQ ID NOS: 1-14 and 57-60) or 
may be a different coding sequence which, as a result of the redundancy 
or degeneracy of the genetic code, encodes the same mature enzymes as the DNA of 
Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
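
The effect of degeneracy described above can be sketched in a few lines. This is an illustration only: it assumes the standard genetic code, and the two short DNA strings are hypothetical examples rather than sequences taken from the Figures.

```python
# Two different coding sequences can encode the same polypeptide.
# Assumption: standard genetic code; example sequences are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
# Codons are generated in TCAG order, matching the amino acid string above.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding-strand DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a, dna_b):
    """Return True when both coding sequences give the same polypeptide."""
    return translate(dna_a) == translate(dna_b)
```

For instance, `ATGGTGAATGCT` and `ATGGTTAACGCA` differ at three nucleotide positions, yet both translate to Met-Val-Asn-Ala.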

The polynucleotide which encodes the mature enzyme of Figures 1-18 (SEQ ID 
NOS: 15-28 and 61-64) may include, but is not limited to: only the coding sequence for the 
mature enzyme; the coding sequence for the mature enzyme and additional coding sequence 
such as a leader sequence or a proprotein sequence; the coding sequence for the mature 
enzyme (and optionally additional coding sequence) and non-coding sequence, such as 
introns or non-coding sequence 5' and/or 3' of the coding sequence for the mature enzyme. 

Thus, the term "polynucleotide encoding an enzyme (protein)" encompasses a 
polynucleotide which includes only coding sequence for the enzyme as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the enzymes having 
the deduced amino acid sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). The 
variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same mature 
enzymes as shown in Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as variants of 
such polynucleotides which variants encode for a fragment, derivative or analog of the 
enzymes of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64). Such nucleotide variants 
include deletion variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotides may have a coding sequence which 
is a naturally occurring allelic variant of the coding sequences shown in Figures 1-18 (SEQ 
ID NOS: 1-14 and 57-60). As known in the art, an allelic variant is an alternate form of a 
polynucleotide sequence which may have a substitution, deletion or addition of one or more 
nucleotides, which does not substantially alter the function of the encoded enzyme. 

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA or a genomic library to isolate the full length DNA and to 
isolate other DNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 10, preferably at least 15, and even 
more preferably at least 30 bases and may contain, for example, at least 50 or more bases. 
The probe may also be used to identify a DNA clone corresponding to a full length 
transcript and a genomic clone or clones that contain the complete gene including 
regulatory and promoter regions, exons, and introns. An example of a screen comprises 
isolating the coding region of the gene by using the known DNA sequence to synthesize an 
oligonucleotide probe. Labeled oligonucleotides having a sequence complementary to that 
of the gene of the present invention are used to screen a library of genomic DNA to 
determine which members of the library the probe hybridizes to. 
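
The idea of a probe "complementary to that of the gene" can be sketched as follows. This is a simplified illustration under stated assumptions: it models hybridization as an exact reverse-complement match, which real hybridization is not, and the function names and example sequence are hypothetical.

```python
# Design a probe complementary to a region of a known sequence and find
# where it would anneal in a library sequence.
# Assumption: exact-match annealing only; names are hypothetical.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_probe(gene, start, length=15):
    """Return a probe complementary to gene[start:start + length]."""
    if length < 10:
        raise ValueError("the text calls for probes of at least 10 bases")
    region = gene[start:start + length]
    if len(region) < length:
        raise ValueError("region extends past the end of the sequence")
    return reverse_complement(region)


def annealing_positions(probe, library_seq):
    """Return 0-based positions in library_seq that the probe matches."""
    site = reverse_complement(probe).upper()
    target = library_seq.upper()
    positions = []
    index = target.find(site)
    while index != -1:
        positions.append(index)
        index = target.find(site, index + 1)
    return positions
```

A library member would be scored as a hit when `annealing_positions` returns a non-empty list for it; a practical screen would also tolerate mismatches, which this sketch does not attempt.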

The present invention further relates to polynucleotides which hybridize to the 
hereinabove-described sequences if there is at least 70%, preferably at least 90%, and more 
preferably at least 95% identity between the sequences. The present invention particularly 
relates to polynucleotides which hybridize under stringent conditions to the hereinabove-
described polynucleotides. As herein used, the term "stringent conditions" means 
hybridization will occur only if there is at least 95% and preferably at least 97% identity 
between the sequences. The polynucleotides which hybridize to the hereinabove described 
polynucleotides in a preferred embodiment encode enzymes which retain 
substantially the same biological function or activity as the mature enzyme encoded by the 
DNA of Figures 1-18 (SEQ ID NOS: 1-14 and 57-60). 
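
The identity thresholds recited above (70%, 90%, 95% and 97%) can be checked for a pair of sequences with a short calculation. The sketch below assumes the two sequences are already aligned, of equal length and without gaps; that is a simplification, and the function names are hypothetical. Whether two polynucleotides actually hybridize depends on experimental conditions, which this arithmetic does not model.

```python
# Percent identity between two pre-aligned, equal-length sequences.
# Assumption: ungapped alignment; names are hypothetical.

THRESHOLDS = (97, 95, 90, 70)  # percent identity levels named in the text


def percent_identity(seq_a, seq_b):
    """Return the percentage of positions at which the sequences agree."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def highest_threshold_met(identity_pct):
    """Return the highest recited threshold met, or None if below 70%."""
    for threshold in THRESHOLDS:
        if identity_pct >= threshold:
            return threshold
    return None
```

For example, two 10-base sequences that differ at one position are 90% identical, which meets the 90% level but not the 95% level associated with stringent conditions.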

Alternatively, the polynucleotide may have at least 15 bases, preferably at least 30 
bases, and more preferably at least 50 bases which hybridize to any part of a polynucleotide 
of the present invention and which has an identity thereto, as hereinabove described, and 
which may or may not retain activity. For example, such polynucleotides may be employed 
as probes for the polynucleotides of SEQ ID NOS: 1-14 and 57-60, for example, for 
recovery of the polynucleotide or as a diagnostic probe or as a PCR primer. 

Thus, the present invention is directed to polynucleotides having at least a 70% 
identity, preferably at least 90% identity and more preferably at least a 95% identity to a 
polynucleotide which encodes the enzymes of SEQ ID NOS: 15-28 and 61-64 as well as 
fragments thereof which fragments have at least 15 bases, preferably at least 30 bases and 
most preferably at least 50 bases, which fragments are at least 90% identical, preferably at 
least 95% identical and most preferably at least 97% identical under stringent conditions 
to any portion of a polynucleotide of the present invention. 

The present invention further relates to enzymes which have the deduced amino acid 
sequences of Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) as well as fragments, analogs 
and derivatives of such enzyme. 

The terms "fragment," "derivative" and "analog" when referring to the enzymes of 
Figures 1-18 (SEQ ID NOS: 15-28 and 61-64) mean enzymes which retain essentially the 
same biological function or activity as such enzymes. Thus, an analog includes a proprotein 
which can be activated by cleavage of the proprotein portion to produce an active mature 
enzyme. 

The enzymes of the present invention may be a recombinant enzyme, a natural 
enzyme or a synthetic enzyme, preferably a recombinant enzyme. 

The fragment, derivative or analog of the enzymes of Figures 1-18 (SEQ ID NOS: 
15-28 and 61-64) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a conserved 
amino acid residue) and such substituted amino acid residue may or may not be one 
encoded by the genetic code, or (ii) one in which one or more of the amino acid residues 
includes a substituent group, or (iii) one in which the mature enzyme is fused with another 
compound, such as a compound to increase the half-life of the enzyme (for example, 
polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature 
enzyme, such as a leader or secretory sequence or a sequence which is employed for 
purification of the mature enzyme or a proprotein sequence. Such fragments, derivatives 
and analogs are deemed to be within the scope of those skilled in the art from the teachings 
herein. 

The enzymes and polynucleotides of the present invention are preferably provided 
in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a 
naturally-occurring polynucleotide or enzyme present in a living animal is not isolated, but 
the same polynucleotide or enzyme, separated from some or all of the coexisting materials 
in the natural system, is isolated. Such polynucleotides could be part of a vector and/or 
such polynucleotides or enzymes could be part of a composition, and still be isolated in that 
such vector or composition is not part of its natural environment. 

The enzymes of the present invention include the enzymes of SEQ ID NOS: 15-28 
and 61-64 (in particular the mature enzyme) as well as enzymes which have at least 70% 
similarity (preferably at least 70% identity) to the enzymes of SEQ ID NOS: 15-28 and 61-64, 
more preferably at least 90% similarity (more preferably at least 90% identity) to 
the enzymes of SEQ ID NOS: 15-28 and 61-64, and still more preferably at least 95% 
similarity (still more preferably at least 95% identity) to the enzymes of SEQ ID NOS: 15-28 
and 61-64, and also include portions of such enzymes, with such portion of the enzyme 
generally containing at least 30 amino acids and more preferably at least 50 amino acids. 

As known in the art, "similarity" between two enzymes is determined by comparing 
the amino acid sequence and its conserved amino acid substitutes of one enzyme to the 
sequence of a second enzyme. 

A variant, i.e. a "fragment", "analog" or "derivative" polypeptide, and reference 
polypeptide may differ in amino acid sequence by one or more substitutions, additions, 
deletions, fusions and truncations, which may be present in any combination. 

Among preferred variants are those that vary from a reference by conservative 
amino acid substitutions. Such substitutions are those that substitute a given amino acid in 
a polypeptide by another amino acid of like characteristics. Typically seen as conservative 
substitutions are the replacements, one for another, among the aliphatic amino acids Ala, 
Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic 
residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of 
the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. 
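
The distinction between identity and similarity can be made concrete using the conservative groups just listed. The following is a minimal sketch assuming two pre-aligned, equal-length, ungapped amino acid sequences in one-letter code; the scoring scheme (identical or same-group residues count as similar) and the function names are illustrative assumptions, not a method prescribed by the text.

```python
# Identity and similarity using the conservative substitution groups above.
# Assumption: ungapped, equal-length alignment; names are hypothetical.

CONSERVATIVE_GROUPS = (
    frozenset("AVLI"),  # aliphatic: Ala, Val, Leu, Ile
    frozenset("ST"),    # hydroxyl: Ser, Thr
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("NQ"),    # amide: Asn, Gln
    frozenset("KR"),    # basic: Lys, Arg
    frozenset("FY"),    # aromatic: Phe, Tyr
)


def is_conservative(res_a, res_b):
    """Return True for a substitution within one conservative group."""
    if res_a == res_b:
        return False
    return any(res_a in g and res_b in g for g in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq_a.upper(), seq_b.upper()))
    identical = sum(a == b for a, b in pairs)
    conservative = sum(is_conservative(a, b) for a, b in pairs)
    total = len(pairs)
    return (100.0 * identical / total,
            100.0 * (identical + conservative) / total)
```

Comparing `ALSDK` with `AISEW`, two positions are identical and two more are conservative (Leu/Ile and Asp/Glu), giving 40% identity and 80% similarity under this scheme.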

Most highly preferred are variants which retain the same biological function and 
activity as the reference polypeptide from which they vary. 

Fragments or portions of the enzymes of the present invention may be employed for 
producing the corresponding full-length enzyme by peptide synthesis; therefore, the 
fragments may be employed as intermediates for producing the full-length enzymes. 
Fragments or portions of the polynucleotides of the present invention may be used to 
synthesize full-length polynucleotides of the present invention. 

The present invention also relates to vectors which include polynucleotides of the 
present invention, host cells which are genetically engineered with vectors of the invention 
and the production of enzymes of the invention by recombinant techniques. 

Host cells are genetically engineered (transduced or transformed or transfected) with 
the vectors of this invention which may be, for example, a cloning vector or an expression 
vector. The vector may be, for example, in the form of a plasmid, a viral particle, a phage, 
etc. The engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the genes of the 
present invention. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to the 
ordinarily skilled artisan. 

The polynucleotides of the present invention may be employed for producing 
enzymes by recombinant techniques. Thus, for example, the polynucleotide may be 
included in any one of a variety of expression vectors for expressing an enzyme. Such 
vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors 
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, 
adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as 
long as it is replicable and viable in the host. 

WO 98/24799 PCT/US97/22623 

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are 
deemed to be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. As representative 
examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli 
lac or trp, the phage lambda promoter and other promoters known to control expression 
of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. 
The vector may also include appropriate sequences for amplifying expression. 

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as 
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E. coli. 

The vector containing the appropriate DNA sequence as hereinabove described, as 
well as an appropriate promoter or control sequence, may be employed to transform an 
appropriate host to permit the host to express the protein. 

As representative examples of appropriate hosts, there may be mentioned: bacterial 
cells, such as E. coli, Streptomyces, Bacillus subtilis; fungal cells, such as yeast; insect cells 
such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or Bowes 
melanoma; adenoviruses; plant cells, etc. The selection of an appropriate host is deemed 
to be within the scope of those skilled in the art from the teachings herein. 

More particularly, the present invention also includes recombinant constructs 
comprising one or more of the sequences as broadly described above. The constructs 
comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention 
has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, 
a promoter, operably linked to the sequence. Large numbers of suitable vectors and 
promoters are known to those of skill in the art, and are commercially available. The 
following vectors are provided by way of example. Bacterial: pQE70, pQE60, pQE-9 
(Qiagen), pD10, psiX174, pBluescript II KS, pNH8A, pNH16a, pNH18A, pNH46A 
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: 
pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia). 
However, any other plasmid or vector may be used as long as they are replicable and viable 
in the host. 

Promoter regions can be selected from any desired gene using CAT 
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two 
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters include 
lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include CMV 
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and 
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within 
the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells containing the 
above-described constructs. The host cell can be a higher eukaryotic cell, such as a 
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a 
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or 
electroporation (Davis, L., Dibner, M., Battey, L, Basic Methods in Molecular Biology, 
(1986)). 

The constructs in host cells can be used in a conventional manner to produce the 
gene product encoded by the recombinant sequence. Alternatively, the enzymes of the 
invention can be synthetically produced by conventional peptide synthesizers. 

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells 
under the control of appropriate promoters. Cell-free translation systems can also be 
employed to produce such proteins using RNAs derived from the DNA constructs of the 
present invention. Appropriate cloning and expression vectors for use with prokaryotic and 
eukaryotic hosts are described by Sambrook et al., Molecular Cloning: A Laboratory 
Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is 
hereby incorporated by reference. 

Transcription of the DNA encoding the enzymes of the present invention by higher 
eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are 
cis-acting elements of DNA, usually about from 10 to 300 bp, that act on a promoter to 
increase its transcription. Examples include the SV40 enhancer on the late side of the 
replication origin bp 100 to 270, a cytomegalovirus early promoter enhancer, the polyoma 
enhancer on the late side of the replication origin, and adenovirus enhancers. 

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance 
gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-
expressed gene to direct transcription of a downstream structural sequence. Such promoters 
can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate 
kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The 
heterologous structural sequence is assembled in appropriate phase with translation 
initiation and termination sequences, and preferably, a leader sequence capable of directing 
secretion of translated enzyme. Optionally, the heterologous sequence can encode a fusion 
enzyme including an N-terminal identification peptide imparting desired characteristics, 
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural 
DNA sequence encoding a desired protein together with suitable translation initiation and 
termination signals in operable reading phase with a functional promoter. The vector will 
comprise one or more phenotypic selectable markers and an origin of replication to ensure 
maintenance of the vector and to, if desirable, provide amplification within the host. 
Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella 
typhimurium and various species within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for bacterial 
use can comprise a selectable marker and bacterial origin of replication derived from 
commercially available plasmids comprising genetic elements of the well known cloning 
vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3 
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, 
USA). These pBR322 "backbone" sections are combined with an appropriate promoter and 
the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to 
an appropriate cell density, the selected promoter is induced by appropriate means (e.g., 
temperature shift or chemical induction) and cells are cultured for an additional period. 

Cells are typically harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any 
convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or 
use of cell lysing agents; such methods are well known to those skilled in the art. 

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175 (1981), and other cell 
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa 
and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, 
a suitable promoter and enhancer, and also any necessary ribosome binding sites, 
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, 
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice 
and polyadenylation sites may be used to provide the required nontranscribed genetic 
elements. 

The enzyme can be recovered and purified from recombinant cell cultures by 
methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or 
cation exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite chromatography and 
lectin chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The enzymes of the present invention may be a naturally purified product, or a 
product of chemical synthetic procedures, or produced by recombinant techniques from a 
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant, insect and 
mammalian cells in culture). Depending upon the host employed in a recombinant 
production procedure, the enzymes of the present invention may be glycosylated or may be 
non-glycosylated. Enzymes of the invention may or may not also include an initial 
methionine amino acid residue. 

β-Galactosidase hydrolyzes lactose to galactose and glucose. Accordingly, the 
OC1/4V, 9N2-31B/G, AEDII12RA-18B/G and F1-12G enzymes may be employed in the 
food processing industry for the production of low lactose content milk and for the 
production of galactose or glucose from lactose contained in whey obtained in a large 
amount as a by-product in the production of cheese. Generally, it is desired that enzymes 
used in food processing, such as the aforementioned β-galactosidases, be stable at elevated 
temperatures to help prevent microbial contamination. 

These enzymes may also be employed in the pharmaceutical industry. The enzymes 
are used to treat intolerance to lactose. In this case, a thermostable enzyme is desired, as 
well. Thermostable β-galactosidases also have uses in diagnostic applications, where they 
are employed as reporter molecules. 

Glucosidases act on soluble cellooligosaccharides from the non-reducing end to give 
glucose as the sole product. Glucanases (endo- and exo-) act in the depolymerization of 
cellulose, generating more non-reducing ends (endo-glucanases, for instance, act on internal 
linkages yielding cellobiose, glucose and cellooligosaccharides as products). β-Glucosidases 
are used in applications where glucose is the desired product. Accordingly, 
M11TL, F1-12G, GC74-22G, MSB8-6G, OC1/4V, VC1-7G1, 9N2-31B/G and 
AEDII12RA-18B/G may be employed in a wide variety of industrial applications, including 
in corn wet milling for the separation of starch and gluten, in the fruit industry for 
clarification and equipment maintenance, in baking for viscosity reduction, in the textile 
industry for the processing of blue jeans, and in the detergent industry as an additive. For 
these and other applications, thermostable enzymes are desirable. 

Antibodies generated against the enzymes corresponding to a sequence of the 
present invention can be obtained by direct injection of the enzymes into an animal or by 
administering the enzymes to an animal, preferably a nonhuman. The antibody so obtained 
will then bind the enzymes themselves. In this manner, even a sequence encoding only a 
fragment of the enzymes can be used to generate antibodies binding the whole native 
enzymes. Such antibodies can then be used to isolate the enzyme from cells expressing that 
enzyme. 

For preparation of monoclonal antibodies, any technique which provides antibodies 
produced by continuous cell line cultures can be used. Examples include the hybridoma 
technique (Kohler and Milstein, 1975, Nature, 256:495-497), the trioma technique, the 
human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole, et al., 1985, in 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. Patent 
4,946,778) can be adapted to produce single chain antibodies to immunogenic enzyme 
products of this invention. Also, transgenic mice may be used to express humanized 
antibodies to immunogenic enzyme products of this invention. 

Antibodies generated against the enzyme of the present invention may be used in 
screening for similar enzymes from other organisms and samples. Such screening 
techniques are known in the art; for example, one such screening assay is described in 
"Methods for Measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp. 87-116, 
which is hereby incorporated by reference in its entirety. 

The present invention will be further described with reference to the following 
examples; however, it is to be understood that the present invention is not limited to such 
examples. All parts or amounts, unless otherwise specified, are by weight. 

In order to facilitate understanding of the following examples, certain frequently 
occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by capital 
letters and/or numbers. The starting plasmids herein are either commercially available, 
publicly available on an unrestricted basis, or can be constructed from available plasmids 
in accord with published procedures. In addition, equivalent plasmids to those described 
are known in the art and will be apparent to the ordinarily skilled artisan. 

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction 
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes 
used herein are commercially available and their reaction conditions, cofactors and other 
requirements were used as would be known to the ordinarily skilled artisan. For analytical 
purposes, typically 1 µg of plasmid or DNA fragment is used with about 2 units of enzyme 
in about 20 µl of buffer solution. For the purpose of isolating DNA fragments for plasmid 
construction, typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in 
a larger volume. Appropriate buffers and substrate amounts for particular restriction 
enzymes are specified by the manufacturer. Incubation times of about 1 hour at 37°C are 
ordinarily used, but may vary in accordance with the supplier's instructions. After digestion 
the reaction is electrophoresed directly on a polyacrylamide gel to isolate the desired 
fragment. 

Size separation of the cleaved fragments is performed using 8 percent 
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057 (1980). 

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or two 
complementary polydeoxynucleotide strands which may be chemically synthesized. Such 
synthetic oligonucleotides have no 5' phosphate and thus will not ligate to another 
oligonucleotide without adding a phosphate with an ATP in the presence of a kinase. A 
synthetic oligonucleotide will ligate to a fragment that has not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between two 
double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). Unless otherwise 
provided, ligation may be accomplished using known buffers and conditions with 10 units 
of T4 DNA ligase ("ligase") per 0.5 µg of approximately equimolar amounts of the DNA 
fragments to be ligated. 

Unless otherwise stated, transformation was performed as described in the method 
of Graham, F. and Van der Eb, A., Virology, 52:456-457 (1973). 

Example 1 

Bacterial Expression and Purification of Glycosidase Enzymes 

DNA encoding the enzymes of the present invention, SEQ ID NOS: 1-14 and 57-60, 
was initially amplified from a pBluescript vector containing the DNA by the PCR 
technique using the primers noted herein. The amplified sequences were then inserted into 
the respective pQE vector listed beneath the primer sequences, and the enzyme was 
expressed according to the protocols set forth herein. The 5' and 3' primer sequences for 
the respective genes are as follows: 

Thermococcus AEDII12RA-18B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGGTGAATGCTATGATTGTC 3' (SEQ ID NO:29) 
3' CGGAAGATCTTCATAGCTCCGGAAGCCCATA 5' (SEQ ID NO:30) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

OC1/4V-33B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGAAGGTCCGATTTTCC 3' 
(SEQ ID NO:31) 

3' CGGAAGATCTTTAAGATTTTAG.'V\ATTCCTT 5' (SEQ ID NO:32) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus 9N2 - 31B/G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGGCTTTCTC 3' 
(SEQ ID NO:33) 

3' CGGAGGTACCTCACCCAAGTCCGAACTTCTC 5' (SEQ ID NO:34) 

Vector: pQE30; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Staphylothermus marinus F1-12G 

5' CCGAGAATTCATTAAAGAGGAGAAATTAACTATGATAAGGTTTCCTGATTAT 3' 
(SEQ ID NO:35) 

3' CGGAAGATCTTTATTCGAGGTTCTTTAATCC 5' (SEQ ID NO:36) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' Bgl 
II. 

Thermococcus chitonophagus GC74 - 22G 

5' CCGAGAATTCATTCATTAAAGAGGAGAAATTAACTATGCTTCCAGGAGAACTTTCTC 3' 
(SEQ ID NO:37) 

3' CGGACGATCCCTACCCCTCCTCTAAGATCTC 5' (SEQ ID NO:38) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
BamHI. 

M11TL 

5' A.'\TAATCTAGAGCATGC.AATTCCCC.Vl-\GACTTCATGATAG 3' (SEQ ID NO:39) 
3' AATA.AA.^GCTTACTGGATCAGTGT.\AGATGCT 5' (SEQ ID NO:40) 

Vector: pQE70; and contains the following restriction enzyme sites 5' SphI and 3' Hind 
III. 

Thermotoga maritima MSB8-6G 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGGAAAGGATCGATGAAATT 3' (SEQ ID NO:41) 
3' CGGAGGTACCTCATGGTTTGAATCTCTTCTC 5' (SEQ ID NO:42) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 

Pyrococcus furiosus VC1 - 7G1 

5' CCGACAATTGATTAAAGAGGAGAAATTAACTATGTTCCCTGAAAAGTTCCTT 3' (SEQ ID NO:43) 
3' CGGAGGTACCTCATCCCCTCAGCAATTCCTC 5' (SEQ ID NO:44) 

Vector: pQE12; and contains the following restriction enzyme sites 5' EcoRI and 3' 
KpnI. 
Bankia gouldi endoglucanase (37GP1) 

5' AATAAGGATCCGTTTAGCGACGCTCGC 3' (SEQ ID NO:45) 

3' AATAAAAGCTTCCGGGTTGTACAGCGGTAATAGGC 5' (SEQ ID NO:46) 

Vector: pQE52; and contains the following restriction enzyme sites 5' BamHI and 3' 
Hind III. 

Thermotoga maritima α-galactosidase (6GC2) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGATCTGTGTGGAAATATTCGGAAAG 3' 
(SEQ ID NO:47) 

3' TCTATAAAGCTTCATTCTCTCTCACCCTCTTCGTAGAAG 5' (SEQ ID NO:48) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

Thermotoga maritima β-mannanase (6GP2) 

yJJ^^^J^Q^^yJQ^yT,AAAGAGGAGAAATT,^\-ACTATGGGGATTGGTGGCGACG 3' 
(SEQ ID NO:49) 

3' TTTATTAAGCTTATCTTTTCATATTCACATACCTCC 5' (SEQ ID NO:50) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

AEPII 1a β-mannanase (63GB1) 

5' TTTATTGAATTCATTAAAGAGGAGAAATTAACTATGCTACCAGAAGAGTTCCTATGGGG 3' 
(SEQ ID NO:51) 

3' TTTATTAAGCTTCTCATCAACGGCTATGGTCTTCATTTC 5' (SEQ ID NO:52) 

Vector: pQEt; and contains the following restriction enzyme sites 5' Hind III and 3' 
EcoRI. 

OC1/4V endoglucanase (33GP1) 

5\\A.W^CAATTG.AATTCATr/WVGAGGAGAAATTAACTATGGTAGAAAGACACTTCAGATAT 
3' (SEQ ID NO:53) 

3' TTTTTCGGATCCAATTCTTCATTTACTCTTTGCCTG 5' (SEQ ID NO:54) 

Vector: pQEt; and contains the following restriction enzyme sites 5' BamHI and 3' 
EcoRI. 

Thermotoga maritima pullulanase (6GP3) 
(SEQ ID NO:55) 

3' ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC 5' (SEQ ID NO:56) 

Vector: pQEt; and contains the following restriction enzyme sites 5' EcoRI and 3' Hind 
III. 

The restriction enzyme sites indicated correspond to the restriction enzyme sites on 
the bacterial expression vector indicated for the respective gene (Qiagen, Inc. Chatsworth, 
CA). The pQE vector encodes antibiotic resistance (Ampr), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), a 6-His 
tag and restriction enzyme sites. 
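
As a quick consistency check on the primer list, the stated cloning sites can be located in the printed primer strings. The following is a minimal Python sketch and not part of the described method: the function name `find_sites` and the lookup table are illustrative assumptions (the recognition sequences themselves are the standard ones for these enzymes), and each primer is scanned left to right exactly as printed.

```python
# Standard recognition sequences for the enzymes named in Example 1.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SphI": "GCATGC",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based index} for every recognition site found in the primer."""
    seq = primer.upper().replace(" ", "")
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


# Primer strings copied as printed for three of the legible 3' primers.
primers = {
    "SEQ ID NO:34": "CGGAGGTACCTCACCCAAGTCCGAACTTCTC",
    "SEQ ID NO:42": "CGGAGGTACCTCATGGTTTGAATCTCTTCTC",
    "SEQ ID NO:56": "ATAAGAAGCTTTTCACTCTCTGTACAGAACGTACGC",
}

for name, seq in primers.items():
    print(name, find_sites(seq))
```

In this sketch the KpnI site is found in SEQ ID NO:34 and NO:42 and the Hind III site in SEQ ID NO:56, which agrees with the 3' sites listed for the corresponding vectors.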

The pQE vector was digested with the restriction enzymes indicated. The amplified 
sequences were ligated into the respective pQE vector and inserted in frame with the 
sequence encoding for the RBS. The ligation mixture was then used to transform the E. coli 
strain M15/pREP4 (Qiagen, Inc.) by electroporation. M15/pREP4 contains multiple copies 
of the plasmid pREP4, which expresses the lacI repressor and also confers kanamycin 
resistance (Kanr). Transformants were identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies were selected. Plasmid DNA was isolated and 
confirmed by restriction analysis. Clones containing the desired constructs were grown 
overnight (O/N) in liquid culture in LB media supplemented with both Amp (100 µg/ml) 
and Kan (25 µg/ml). The O/N culture was used to inoculate a large culture at a ratio of 
1:100 to 1:250. The cells were grown to an optical density 600 (O.D.600) of between 0.4 and 
0.6. IPTG ("Isopropyl-β-D-thiogalactopyranoside") was then added to a final 
concentration of 1 mM. IPTG induces by inactivating the lacI repressor, clearing the P/O 
leading to increased gene expression. Cells were grown an extra 3 to 4 hours. Cells were 
then harvested by centrifugation. 

The primer sequences set out above may also be employed to isolate the target gene 
from the deposited material by hybridization techniques described above. 
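
The inoculation and induction steps above reduce to simple dilution arithmetic. A minimal Python sketch follows; the 500 ml culture volume and the 1 M IPTG stock are hypothetical values chosen only for illustration (neither is given in the text), and the small volume change caused by adding the stock is ignored.

```python
def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock needed so the culture reaches final_mM (C1*V1 = C2*V2)."""
    return final_mM * culture_ml / stock_mM


def inoculum_volume_ml(culture_ml: float, ratio: float) -> float:
    """Volume of overnight culture for a 1:ratio inoculation."""
    return culture_ml / ratio


CULTURE_ML = 500.0      # assumed culture size, not given in the text
IPTG_STOCK_MM = 1000.0  # assumed 1 M IPTG stock, not given in the text

# 1 mM final IPTG concentration, as stated in the protocol.
iptg_ml = stock_volume_ml(1.0, CULTURE_ML, IPTG_STOCK_MM)

# Inoculum range for the stated 1:250 to 1:100 ratios.
inoculum_low = inoculum_volume_ml(CULTURE_ML, 250)
inoculum_high = inoculum_volume_ml(CULTURE_ML, 100)

print(f"IPTG stock to add: {iptg_ml} ml")
print(f"Overnight culture to add: {inoculum_low} to {inoculum_high} ml")
```

Under these assumed volumes, induction needs 0.5 ml of stock and inoculation needs between 2 and 5 ml of overnight culture.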

Example 2 

Isolation of a Selected Clone From the Deposited Genomic Clones 

A clone is isolated directly by screening the deposited material using the 
oligonucleotide primers set forth in Example 1 for the particular gene desired to be 
isolated. The specific oligonucleotides are synthesized using an Applied Biosystems 
DNA synthesizer. The oligonucleotides are labeled with 32P-γ-ATP using T4 
polynucleotide kinase and purified according to a standard protocol (Maniatis et al., 
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring, NY, 
1982). The deposited clones in the pBluescript vectors may be employed to transform 
bacterial hosts which are then plated on 1.5% agar plates to the density of 20,000- 
50,000 pfu/150 mm plate. These plates are screened using Nylon membranes according 
to the standard screening protocol (Stratagene, 1993). Specifically, the Nylon 
membrane with denatured and fixed DNA is prehybridized in 6 x SSC, 20 mM 
NaH2PO4, 0.4% SDS, 5 x Denhardt's, 500 µg/ml denatured, sonicated salmon sperm 
DNA; and 6 x SSC, 0.1% SDS. After one hour of prehybridization, the membrane is 
hybridized with hybridization buffer 6 x SSC, 20 mM NaH2PO4, 0.4% SDS, 500 µg/ml 
denatured, sonicated salmon sperm DNA with 1 x 10^6 cpm/ml 32P-probe overnight at 
42°C. The membrane is washed at 45-50°C with washing buffer 6 x SSC, 0.1% SDS 
for 20-30 minutes, dried and exposed to Kodak X-ray film overnight. Positive clones are 
isolated and purified by secondary and tertiary screening. The purified clone is 
sequenced to verify its identity to the primer sequence. 

Once the clone is isolated, the two oligonucleotide primers corresponding to the 
gene of interest are used to amplify the gene from the deposited material. A polymerase 
chain reaction is carried out in 25 µl of reaction mixture with 0.5 µg of the DNA of the 
gene of interest. The reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM 
each of dATP, dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq 
polymerase. Thirty five cycles of PCR (denaturation at 94°C for 1 min; annealing at 
55°C for 1 min; elongation at 72°C for 1 min) are performed with the Perkin-Elmer 
Cetus automated thermal cycler. The amplified product is analyzed by agarose gel 
electrophoresis and the DNA band with expected molecular weight is excised and 
purified. The PCR product is verified to be the gene of interest by subcloning and 
sequencing the DNA product. The ends of the newly purified genes are nucleotide 
sequenced to identify full length sequences. Complete sequencing of full length genes is 
then performed by Exonuclease III digestion or primer walking. 
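
The reaction set-up above implies two quantities that are easy to compute: the primer concentration in the 25 µl reaction and the total programmed hold time of the thermal cycler. The Python sketch below is illustrative only; the helper names are assumptions, and ramp times between temperatures, which depend on the instrument, are not included.

```python
# Hold times per cycle, in minutes, as stated in the protocol.
CYCLE_MINUTES = {"denaturation": 1.0, "annealing": 1.0, "elongation": 1.0}
CYCLES = 35
REACTION_UL = 25.0
PRIMER_PMOL = 25.0


def hold_time_minutes(cycle: dict, cycles: int) -> float:
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(cycle.values())


def micromolar(pmol: float, volume_ul: float) -> float:
    """Concentration in µM; 1 pmol/µl equals 1 µM."""
    return pmol / volume_ul


total_hold = hold_time_minutes(CYCLE_MINUTES, CYCLES)
primer_um = micromolar(PRIMER_PMOL, REACTION_UL)

print(f"Programmed hold time: {total_hold} min")
print(f"Each primer: {primer_um} µM")
```

With the stated values, each primer is present at 1 µM and the 35 cycles account for 105 minutes of hold time.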

Example 3 
Screening for Galactosidase Activity 

Screening procedures for α-galactosidase protein activity may be assayed for as 
follows: 

Substrate plates were provided by a standard plating procedure. Dilute XL1- 
Blue MRF' E. coli host (Stratagene Cloning Systems, La Jolla, CA) to O.D.600 = 1.0 
with NZY media. In 15 ml tubes, inoculate 200 µl diluted host cells with phage. Mix 
gently and incubate tubes at 37°C for 15 min. Add approximately 3.5 ml LB top 
agarose (0.7%) containing 1 mM IPTG to each tube and pour onto the NZY plate surface. 
Allow to cool and incubate at 37°C overnight. The assay plates are obtained as 
substrate p-Nitrophenyl α-galactoside (Sigma) (200 mg/100 ml) (100 mM NaCl, 100 
mM Potassium-Phosphate) 1% (w/v) agarose. The plaques are overlayed with 
nitrocellulose and incubated at 4°C for 30 minutes whereupon the nitrocellulose is 
removed and overlayed onto the substrate plates. The substrate plates are then incubated 
at 70°C for 20 minutes. 

Example 4 

Screening of Clones for Mannanase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannanase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
Thermotoga maritima lambda gt11 library was diluted in SM (phage dilution buffer): 5 
x 10^7 pfu/µl diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37°C for 15 minutes. 
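
The two-step phage dilution can be followed with a short calculation. This Python sketch is a hedged illustration: the starting titer of 5 x 10^7 pfu/µl is an assumed reading of the text, whereas the 1:1000 and 1:100 steps and the 8 µl plated volume are as stated. Whatever the starting titer, the two steps together give a 10^5-fold dilution.

```python
def serial_dilution(start_titer: float, dilution_factors: list) -> list:
    """Return the titer after each successive dilution, starting with the input titer."""
    titers = [start_titer]
    for factor in dilution_factors:
        titers.append(titers[-1] / factor)
    return titers


START_TITER = 5e7   # pfu/µl; assumed reading of the amplified library titer
PLATED_UL = 8.0     # µl of diluted phage plated per tube, as stated

titers = serial_dilution(START_TITER, [1000, 100])
pfu_per_plate = titers[-1] * PLATED_UL

print("Titers (pfu/µl):", titers)
print("Phage plated per tube (pfu):", pfu_per_plate)
```

Under the assumed starting titer this works out to roughly 4,000 pfu per plate.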

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52°C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37°C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4°C. 

An Azo-galactomannan overlay was applied to the LB plates containing the 
lambda plaques. The overlay contains 1% agarose, 50 mM potassium-phosphate buffer 
pH 7, 0.4% Azocarob-galactomannan (Megazyme, Australia). The plates were 
incubated at 72°C. The Azocarob-galactomannan treated plates were observed after 4 
hours then returned to incubation overnight. Putative positives were identified by 
clearing zones on the Azocarob-galactomannan plates. Two positive clones were 
observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 5 

Screening of Clones for Mannosidase Activity 

A solid phase screening assay was utilized as a primary screening method to test 
clones for β-mannosidase activity. 

A culture solution of the Y1090- E. coli host strain (Stratagene Cloning Systems, 
La Jolla, CA) was diluted to O.D.600 = 1.0 with NZY media. The amplified library from 
AEPII 1a lambda gt11 library was diluted in SM (phage dilution buffer): 5 x 10^7 pfu/µl 
diluted 1:1000 then 1:100 to 5 x 10^2 pfu/µl. Then 8 µl of phage dilution 
(5 x 10^2 pfu/µl) was plated in 200 µl host cells. They were then incubated in 15 ml 
tubes at 37°C for 15 minutes. 

Approximately 4 ml of molten LB top agarose (0.7%) at approximately 52°C 
was added to each tube and the mixture was poured onto the surface of LB agar plates. 
The agar plates were then incubated at 37°C for five hours. The plates were replicated 
and induced with 10 mM IPTG-soaked Duralon-UV nylon membranes (Stratagene 
Cloning Systems, La Jolla, CA) overnight. The nylon membranes and plates were 
marked with a needle to keep their orientation and the nylon membranes were then 
removed and stored at 4°C. 

A p-nitrophenyl-β-D-manno-pyranoside overlay was applied to the LB plates 
containing the lambda plaques. The overlay contains 1% agarose, 50 mM potassium- 
phosphate buffer pH 7, 0.4% p-nitrophenyl-β-D-manno-pyranoside (Megazyme, 
Australia). The plates were incubated at 72°C. The p-nitrophenyl-β-D-manno- 
pyranoside treated plates were observed after 4 hours then returned to incubation 
overnight. Putative positives were identified by clearing zones on the p-nitrophenyl-β- 
D-manno-pyranoside plates. Two positive clones were observed. 

The nylon membranes referred to above, which correspond to the positive clones, 
were retrieved, oriented over the plate and the portions matching the locations of the 
clearing zones for positive clones were cut out. Phage was eluted from the membrane 
cut-out portions by soaking the individual portions in 500 µl SM (phage dilution buffer) 
and 25 µl CHCl3. 

Example 6 
Screening for Pullulanase Activity 

Screening procedures for pullulanase protein activity may be assayed for as 
follows: 

Substrate plates were provided by a standard plating procedure. Host cells are 
diluted to O.D.600 = 1.0 with NZY or appropriate media. In 15 ml tubes, inoculate 200 
µl diluted host cells with phage. Mix gently and incubate tubes at 37°C for 15 min. 
Approximately 3.5 ml LB top agarose (0.7%) is added to each tube and the mixture 
is plated, allowed to cool, and incubated at 37°C for about 28 hours. Overlays of 4.5 
ml of the following substrate are poured: 

100 ml total volume 

0.5 g Red Pullulan (Megazyme, Australia) 

1.0 g Agarose 

5 ml Buffer (Tris-HCl pH 7.2 @ 75°C) 

2 ml 5 M NaCl 

5 ml CaCl2 (100 mM) 

85 ml dH2O 

Plates are cooled at room temperature, and then incubated at 75°C for 2 hours. 
Positives are observed as showing substrate degradation. 
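
The final concentrations implied by the overlay recipe can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the helper names are assumptions, and the stock concentrations (5 M NaCl, 100 mM CaCl2) are taken as read from the recipe above. It also scales the substrate mass to a single 4.5 ml overlay.

```python
TOTAL_ML = 100.0  # total recipe volume, as stated


def final_mM(stock_mM: float, stock_ml: float, total_ml: float = TOTAL_ML) -> float:
    """Final concentration after diluting stock_ml of stock into total_ml."""
    return stock_mM * stock_ml / total_ml


def percent_wv(grams: float, total_ml: float = TOTAL_ML) -> float:
    """Weight/volume percentage (g per 100 ml)."""
    return 100.0 * grams / total_ml


def scale(amount: float, batch_ml: float, total_ml: float = TOTAL_ML) -> float:
    """Scale a recipe amount from total_ml down (or up) to batch_ml."""
    return amount * batch_ml / total_ml


nacl_mM = final_mM(5000.0, 2.0)      # 2 ml of 5 M NaCl
cacl2_mM = final_mM(100.0, 5.0)      # 5 ml of 100 mM CaCl2
pullulan_pct = percent_wv(0.5)       # 0.5 g Red Pullulan
agarose_pct = percent_wv(1.0)        # 1.0 g agarose
pullulan_mg_per_overlay = scale(0.5, 4.5) * 1000.0  # mg in one 4.5 ml overlay

print(f"NaCl: {nacl_mM} mM, CaCl2: {cacl2_mM} mM")
print(f"Red Pullulan: {pullulan_pct}% (w/v), agarose: {agarose_pct}% (w/v)")
print(f"Red Pullulan per 4.5 ml overlay: {pullulan_mg_per_overlay:.1f} mg")
```

On this reading the overlay is 100 mM NaCl, 5 mM CaCl2, 0.5% (w/v) Red Pullulan and 1% (w/v) agarose.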

Example 7 
Screening for Endoglucanase Activity 

Screening procedures for endoglucanase protein activity may be assayed for as 
follows: 

1. The gene library is plated onto 6 LB/GelRite/0.1% CMC/NZY agar plates 
(~4,800 plaque forming units/plate) in E. coli host with LB agarose as top agarose. The 
plates are incubated at 37°C overnight. 

2. Plates are chilled at 4°C for one hour. 

3. The plates are overlayed with Duralon membranes (Stratagene) at room 
temperature for one hour and the membranes are oriented and lifted off the plates and 
stored at 4°C. 

4. The top agarose layer is removed and plates are incubated at 37°C for ~3 
hours. 

5. The plate surface is rinsed with NaCl. 

6. The plate is stained with 0.1% Congo Red for 15 minutes. 

7. The plate is destained with 1 M NaCl. 

8. The putative positives identified on plate are isolated from the Duralon 
membrane (positives are identified by clearing zones around clones). The phage is 
eluted from the membrane by incubating in 500 µl SM + 25 µl CHCl3. 

9. Insert DNA is subcloned into any appropriate cloning vector and 
subclones are reassayed for CMCase activity using the following protocol: 

i) Spin 1 ml overnight miniprep of clone at maximum speed for 3 
minutes. 

ii) Decant the supernatant and use it to fill "wells" that have been 
made in an LB/GelRite/0.1% CMC plate. 

iii) Incubate at 37°C for 2 hours. 

iv) Stain with 0.1% Congo Red for 15 minutes. 

v) Destain with 1 M NaCl for 15 minutes. 

vi) Identify positives by clearing zone around clone. 

Numerous modifications and variations of the present invention are possible in 
light of the above teachings and, therefore, within the scope of the appended claims, the 
invention may be practiced otherwise than as particularly described. 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide selected from the group consisting of: 

(a) SEQ ID NOS: 1-14 and 57-60; 

(b) SEQ ID NOS: 1-14 and 57-60, wherein T can also be U; 

(c) polynucleotide sequences complementary to SEQ ID NOS: 1-14 and 57- 
60; 

(d) polynucleotide sequences which encode an amino acid sequence as set 
forth in SEQ ID NOS: 15-28, and 61-64; and 

(e) fragments of (a), (b), (c) or (d) that are at least 15 consecutive bases in 
length and that will selectively hybridize to DNA which encodes a 
polypeptide of SEQ ID NOS: 15-28, and 61-64. 

2. A vector comprising a polynucleotide of claim 1. 

3. A host cell containing the vector of claim 2. 

4. The method of claim 3, wherein the host cell is a eukaryotic cell. 

5. The method of claim 3, wherein the host cell is a prokaryotic cell. 

6. A method for producing a polypeptide comprising: 

(a) culturing the host cells of claim 3; 

(b) expressing from the host cell of claim 3 a polypeptide encoded by said 
polynucleotide; and 

(c) isolating the polypeptide. 

7. An enzyme selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence set forth in SEQ ID NOS: 
15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 consecutive amino acid residue as 
an enzyme of (a). 

8. An enzyme of which at least a portion is coded for by a polynucleotide of 
claim 1, and which is selected from the group consisting of: 

(a) an enzyme comprising an amino acid sequence which is at least 70% 
identical to an amino acid sequence selected from the group of amino 
acid sequences set forth in SEQ ID NOS: 15-28 or 61-64; and 

(b) an enzyme which comprises at least 30 amino acid residues to the 
enzyme of (a). 

9. A method for generating glucose from soluble cell oligosaccharides comprising 
contacting a sample containing oligosaccharides with an effective amount of an 
enzyme selected from the group consisting of an enzyme having the amino acid 
sequence set forth in SEQ ID NOS: 15-28, 61-63 and 64 such that glucose is 
produced. 

10. The method of claim 9, wherein the sample is selected from the group consisting 
of dairy products, fruit juices, detergents, textiles, guar gum, animal feed, plant 
biomass and waste products. 

11. The method of claim 9, wherein the oligosaccharide is selected from the group 
consisting of maltose, cellobiose, lactose, sucrose, raffinose, stachyose, 
verbascose, cellulose, starch, amylose, glycogen, disaccharides, polysaccharides 
and pullulan. 
Figure 7a. 

Figure 7b (Continued) 
[Nucleotide sequence with deduced amino acid translation; not legible in the source.] 

Figure 8a. 
[Nucleotide sequence with deduced amino acid translation, positions 1261 to 1533; not legible in the source.] 

Figure 8b (continued) 
9 18 27 

5' 



36 45 54 



ATG ACA ATA CGT tTA CCG ACG CTC GCO CTC TCC CCA CCG CTG AGC CCA CTC ACC 
Hat Arg He Arg Leu AI. Thr Ala Leu Cy^ AU Ala Leu Sex So JJr 

'^^ Si 90 QQ 

TTT CCA CAT AAT CTTA ACC CTA CAA ATC GAC GCC GAC CCC CGT AAA AAA CTC ^TC 
Pho Al. ASP A.a Val Tlxr Vol Qln He A.p Al. Aap Cly Cly Ly. ^ 

117 126 

ACC CGA GCC CT^ TAC CCC ATG AAT AAC TCC AAC CCA CAA AM COT ACC Gat Inl 
S.r Arg Ala L.u Tyr cly «ec Asn Aan Ser A.n Ala Clu S S Z 

"9 198 507 -.-.c 

Aap ?rp Gin Arg Ph, Arg Asp Ala Gly V.l Ar, Met Lou 2^ ^ 

225 a3<! 343 2S2 2si 

AAC AAC ACC ACC AAA TAT AAC TGC CAA CTC CAC CTC ACC .CT CAT CCG GAT 
A.n Asn Ser. Thr ly^ Tyr A^n Trp Gin Leu Hi3 Lr. S=r Ser Hia S S 

288 297 3DS 3is 

Tyr A^n Asn v«l Tyr Ala Gly Aan Aan Aaa 7=, Aap Aan Ar, Val Ala Ju 'Al 

3S1 360 3fiQ -,,0 

s: s s s s s s S 

459 4S3 477 

TOG TOO ACC C« GTC OCT CAG AAT CTC GCT «K OCC COT GAA S AAT CTO ^ 

Trp Trp Thr Cly Vol Al^ Gin Aan Leu Ala Cly Gly Gly Glu Pr^ SIS ^ 

504 513 c-ji 

GGC GGC GCC GAA GCG CTG GTT GAA GGA GAC CCC AAT CTC TAC CTC ATX3 CAT ^ 
Cly Gly Cly Glu Ala Lou Val Clu Cly Aap Pro Aan ^™ Sp S 

"8 SS7 57S 585 

TO CCA KC CAC ACT <?re GOT AIT CTC GAG CAC TGG rn> G« OTA AAC CSa CTC 
S« Pro Ala Aap Thr Val Cly He Leu Aap Hia Trp Phe Gly JJn S 2S 

GGC CTO CGO CGT GCcIi^ OCe AAA TGO AOT Itc GAT AAC ^ CCC CCcI^ 
Gly val Arg Arg Cly Lya Ala Ly. Ty, Trp Ser Mat Aap Aan Clu So S lU 

657 $56 67^^ gg^ 

S ^ nf i^"" '^'^ J^CC CCG OTA GAA GAT TO 

Trp val Gly Thr Hi= A^p A.p Val Val Lya Glu Gin Thr Pro vll Sp ^ 

Figure 9a. 
[Nucleotide sequence with deduced amino acid translation; not legible in the source.] 

Figure 9b (Continued) 
Bankia gouldi endoglucanase (37GP1) (continued) 

[Nucleotide sequence with deduced amino acid translation; not legible in the source.] 

Figure 9a (Continued) 
[Nucleotide sequence with deduced amino acid translation; not legible in the source.] 

Figure 10a. 
Thermotoga maritima alpha-galactosidase 

^. T^. ^ ^. '^rc A*C AAC CTC CIX: OCG TO 

Asp nir Trt. Clu Clu Tlu Lys ^'n Zl iy^ Zn 'p^ 'fH 

!3! ^? !!f f^? i^T* ^ ^5 ^ CAA AAC CAC ^T^ ^ (uc TCc ^ 
Pho Clu vai Pho ain He '>Z '^P ^ Giu nl cly li; 

711 720 729 733 747 

ore ACA AGA GGAGACmcCATOOTCAAGACA3T?a^ 

Val Ttr Arg Gly Asp Phe Pro S«r Vai Clu Glu Jtet All vll ile All clu 

^f^^^^CraS^ATATXXJ^CCCCCG^ACTCT 

Asn Gly Phe He Pro Cly lie Trp Ibr Ala Pro Pha S^r vll sl^ clu 7^ 

819 B28 S37 84S 855 B64 

Axp Val Phe Asn Glu His Pro Arp Trp Vai Val Lys GlG A^ Gl^ Glu P^o Lys 
882 831 900 909 qir 

Ket Ala Tyr Arg Aan Trp Affn Lyo Lyo lie Tyr Ala As^ ^ £^ 

527 936 945 954 963 * 972 

^^^AACTO3CTrriCGATCTCTOTCATCTCTOA^ 

Clu Vai Leu Asn Trp Phe Asp Leu Phe Ser S«r Leu Arg Lys Met Gly Ty^ 

981 990 999 1008 1017 1026 

AGGTACTrCAACATCGACTTTCrcriCCCCCCrrGCCGCT 

Arg Tyr Phe Lys He Asp Phe Leu Pha Ala Gly Ala Vai Pro Gly Glu Arg Lya 

Air. . "i! ^°S3 1062 1071 1080 

AAGAEICATAACACC^ATTCACCCCTOACAAAACCGATTGACAC^ 

Lye llo Thr Pro lie Gin Ala Phe Arg Lys Giy He Glu Thr Ho Arg Lye 

1^^°" iiiS 1125 1134 

OCG GTG CGA GAA GAT TCTTTCATCCTCCGATOCCGCTCTCCC CTT CTT CCC CCA 

Ala Vol Gly Glu Asp Ger Phe Ile Leu Gly Cya Gly S«x Pro Leu Leu Pro All 

1143 ns2 U61 ino 1179 1188 

TO OCA TCC Crix: GAC CCC ATG AOQ ATA OGA OCT GAC ACT CO; CCG to: 

Vai Gly Cyx Vnl Ajp cly Mfit Arg lie Cly Pro Aap Tlir Ala Pro Phe Trp Cly 

Figure 10b (Continued) 
[Nucleotide sequence with deduced amino acid translation; not legible in the source.] 

Figure 10c (Continued) 
9 18 27 36 45 54 

5' ATG GGG ATT GGT GGC GAC GAC TCC TGG AGC CCG TCA CTA TCG GCG GAA TTC CTT 

Mec Gly He Gly Gly Asp Asp Ser Trp Sex Pro Sar Vai Ser Ala Glu Phe Leu 

€3 72 Bl 90 39 lOB 

TTA TTG ATC GTT GAG CTC TCT TTC GTT CTC TTT GCA AtTT GAC GAG TTC CTG AAA 

Leu Leu Il« Val Glu Lau Ser Ph© Val Lau Phe Ala Ser Acp Glu Phe Val Lys 

117 126 135 144 153 162 

GTG GAA AAC GGA AAA TTC OCT CTG AAC OSA AAA GAA TTC AGA TTC ATT GGA AGC 

Val Glu Asa Gly Lys Phe Ala Lau Asn Gly Lys Glu Ph© Arg Ph« lie Gly Ser 

171 IBO 189 198 207 216 

AAC AAC TAC TAC ATG CAC TAC AAG AGC AAC GGA ATG ATA GAC ACT GTT CTG GAG 

.Asn Asn Tyr Tyr Met His Tyr Lys Ser Asn Gly Mat Ila Asp Sar Val Leu Glu 

225 234 243 252 261 270 

AGT GCC AGA GAC ATO GGT ATA AAG GTC CTC AGA ATC TOG GGT TTC CTC GAC GGG 

Ser Ala Arg Asp MeC Gly lie Lys Val Leu Arg lie. Trp Gly Phe Leu Asp Gly 

279 288 297 306 315 324 

GAG AGT TAC TGC AGA GAC AAG AAC ACC TAC ATG CAT CCT GAG CCC GGT GTT TTC 

Glu Scr Tyx Cys Arg Asp Lys Aaa Thr Tyr Met His Pro Glu Pro Gly Val Phe 

333 342 351 360 369 37S 

GGG CTG CCA GAA GGA ATA TCG AAC GCC CAG AGC GGT TTC GAA AGA CTC GAC TAC 

Gly Val Pro Glu Gly IIq Ser Asn Ala Gin Ser Gly Pha Glu Arg Lau Asp Tyr 

387 396 405 414 423 432 

ACA GTT AAA GCG AAA GAA CTC GGT ATA AAA CTT GTC ATT GTT CTT GTG AAC 

Thr Val Ala Lys Ala Lys Glu L«u Gly He Lys Leu Val He Val Leu Val Asn 

441 450 459 468 477 486 

AAC TGG GAC CAC TTC GGT GGA ATG AAC CAG TAC GTG AGG TGG TTT GGA GGA ACC 

Asn Trp Asp Asp Phe Gly Gly Mec Asn Gin Tyr Vol Arg Trp Phe Gly Gly Thr 

495 504 513 522 531 540 

CAT CAC GAC GAT TTC TAC ACA GAT CAC AAG ATC AAA GAA GAG TAC AAA AAG TAC 

His His Asp Asp Phe Tyr Arg Asp Glu Lys He Lys Glu Glu Tyr Lys Cys Tyr 

Figure 11a. 
Arg Glu 
546 



5" 558 567 575 585 594 

?!! !!! !!! '^'^ '^'^^ ccr tac agg all 

Val ser Phe Leu V«a Asn His V«l Asn Thr Tyr Thr Gly vll Pro T^r 
603 612 621 630 639 

!!! ™ *!! ?^ ^ ^^"^ CCC CCC TCT Ca^C ACG GAC 

Glu Pro Thr lie Met Ala Trp Glu Uu III III Glu Pro L"g 'c^l Gl'u Thr 

«6 675 684 693 

^ !!! !^ !^ ™ !!!!!! =^ "° tIc tac ata ]^ 

Lyu Ser Gly A^.n Thr Lou Val Glu Trp Val G^u Met Ser Ser ^ Ue 
730 729 738 747 

-?!!!! !!! °" ^^'^ -iTC -iTC agc 21 

ser I^u Asp Pro Asn His Leu Val Ma Val Gly Up ciu Gly Phe Phe slL Zl 
'■'^ 783 792 80' 

^ !!! ^ ^ =^ ta^ aac ggc 1^, 

Tyr Clu Gly Phe Ly» Pro Tyr Gly Gly cIu III gIu III Ii; 

828 837 846 asS bca 

™ !!!!!!!!!!!!! "° ^^'^ ^^'^ *^ <=Ac TO GGC See 

Ser Gly Val Asp Trp Lya Lys Leu Leu Ser He Glu Th^ vll Up p^ Gly tU 

_ "3 882 B91 900 909 9,0 

!!! !!! ^"^^ ^CC CAC TGG CGT GTC act CCA GAC AAC TAT GCC CAG TGC 
Phe His LOU Tyr Pro Ser Hi, Trp Gly Val Ser Pro Glu Iln ^ HI IZ T^ 

"7 936 945 954 963 „, 

^- !!! ^! !!! t!* ^^"^ AAG ATC GCA AAA gag ATC GGA AAA CCC 

Oly Ala Lys Trp lie Glu Asp His lie Lys lie Ala Zyl clu III Gly L^ Pro 

S90 999 1008 ioi7 losfi 

!^ !^ '''' "'^ '^'^ «^ aga acc 

Val Val Leu Glu Glu Tyr Gly lie Pro Lys Ser Ala Pro vll Ijg Thr II 
1044 1053 1062 1071 1080 

C!! "^"^ """^ "'^ "^^^ cat crc cgt gga gat gga ccg atc 

lie Tyr Axg Uau Trp Aan Asp Leu Val Tyr Asp Leu Gly G^y Up ciy III Mat 

Figure 11b (Continued) 
[Nucleotide sequence with deduced amino acid translation (continued); not legible in the source.] 

Figure 11c (Continued) 
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*».o«.oto^o narltina p.nanna,coo ^ (continued, (I Q ^ 
1629 1638 1647 ..^^ 

-!T !^ !!! ^! ^ !!! - -J - -A^SS cnx^'Sl cn. ^^S^ 



xie ci'^ z^u" ii^ vii" - 

Pro OXy Lys S« A.p Trp GIu CIu Val Ar^ vll Z I'yl V^l ^ ^ ^ 
1737 1746 ,^55 

Tc. ^ «o ere cac r.c ,.c"S cca aac'S oac 

s.r oiu cy. Giu iio Qiu L'^ lie" 'r^ HI lH L'^ 'vH li: ;i; 

1791 . 1800 1809 laifl 

AAc c« ^ ^ ^ ecc TAc Gcc cn, ox, Mc ccc occ .r.'S^ 

Oly Arg Leu Arg Pro ,Vx Ala' ;;:i' HI HI ^ ^ ^ ^ ^ 

^^^^ 1854 1863 iq7!J 

!^ ^ ^! !!! ^ ^cr cc'Sc ^ 
U.U M« A.n A.„ Ala Val Gi"u' s'^ ^il 'oil ""^ --; - 

1917 ,,,, 

AAA CAC TAC ACA ACA TTC CAT GTA AO. ATT OAc':^ OAC AOa'I^ cCC 030^^ 
Lys Glu Tyr Arg Arg Phc Ki« Val III al'J ^ ^ 

^^^^ 1962 1971 iqftrt 

AAA GAA CTT CAC ATA CCA err CC. CAT CAT^JJJ AGO TAC^^J CCA CCG^J^ 

Lys Glu Leu His Ile Gly Val Val Gly Asp Leu Arg Tyr Asp

2007 2016 2025 

TTC ATC GAT AAT GTG AOA CTT TAT AAA AGA AC^'S^ OGT KTo'"^ , . 
Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met



Figure 11d (Continued)
AEPII 1a β-mannosidase (63GB1)

[Sequence listing illegible in the source scan]

Figure 12a
AEPII 1a β-mannosidase (63GB1) (continued)

558 567 576 595 

AGG ACA GTT GTT GAG TTT GCC AAG TAT GCT GCT TAC ATC GCC CAT GCG CTC GGA

Arg Thr Val Val Glu Phe Ala Lys Tyr Ala Ala Tyr Ile Ala His Ala Leu Gly

"1 630 639 <..p 

!!! f ^ aac caa ccr atg gta crrr (m; gag ck 

Asp Leu Val Asp Thr Trp Ser Thr Phe Asn Glu Pro Met Val Val Val Glu Leu
657 6fi6 675 684 693 703 

^ ™ !!!!!! ^"""^ ^ ^ ^ CCC GAG G?C 

Gly Tyr Leu Ala Pro Tyr Ser Gly Phe Pro Pro Gly Val Met Asn Pro Glu Ala

711 720 729 738 747 75^ 

GCG AAC CTG GCG ATC CTC AAC ATG ATA AAC GCC CAC GCC TTG GCA TAT AAG ATG 

Ala Lys Leu Ala Ile Leu Asn Met Ile Asn Ala His Ala Leu Ala Tyr Lys Met

765 774 783 792 801 810 

ATA AAG AGG TTC GAC ACC AAG AAC GCC QXT GAC GAT AGC AAG TCC CCT GCG GAC 

Ile Lys Arg Phe Asp Thr Lys Lys Ala Asp Glu Asp Ser Lys Ser Pro Ala Asp

815 828 837 846 855 864 

GTT GGC ATA ATT TAC AAC AAC ATC CGT CTT GCC TAC CCT AAA GAC CCT AAC GAT 

Val Gly Ile Ile Tyr Asn Asn Ile Gly Val Ala Tyr Pro Lys Asp Pro Asn Asp

873 882 891 900 909 919 

CCC AAG GAC GTT AAA GCA GCC GAA AAC GAC AAC TAC TTC CAC ACC GGA CTG TTC 

Pro Lys Asp Val Lys Ala Ala Glu Asn Asp Asn Tyr Phe His Ser Gly Leu Phe

527 935 945 954 963 972 

TTT GAT GCC ATC CAC AAG GGT AAG CTC AAC ATA GAC TTC GAC GGC GAA AAC TTT 
Phe Asp Ala Ile His Lys Gly Lys Leu Asn Ile Glu Phe Asp Gly Glu Asn Phe

"0 999 1008 1017 1026 

GTA AAA GTT AGA CAC CTA AAA GCC AAT GAC TGG ATA CCC CTC AAC TAC TAC ACC 

Val Lys Val Arg His Leu Lys Gly Asn Asp Trp Ile Gly Leu Asn Tyr Tyr Thr

1035 1044 1053 1062 1071 1080 

CGC CAC GTT GTT AGA TAT TCG GAG CCC AAO TTC CCA ACT ATA CCC CTC ATA TCC 

Arg Glu Val Val Arg Tyr Ser Glu Pro Lys Phe Pro Ser Ile Pro Leu Ile Ser

Figure 12b (Continued)
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1098 Mft-, 

~ r ^ ^ ccc ^r^ ^ ^-n. 

GAT CCC ATG CCC GTC ACT run^ " ^^"^^ 1,75 

:-- :- -- --^ -"^ ^! ^ ^ cL' 

A3P Cly Met Pro Val Sar A«« ti 

ru ci, 01. V.X ^ „^ - 

' - - - - - - ^- ~ -~ -~ ^- ^ 

-7 !!T !!! - -'1° - cc'S 

v.. ^ ^ ~ --- ... ... 2! 

^^^^ " 1314 

A.. ... c,. ^ ^ ^^^^ ^ ... _ 

:.u ^ ^ --- 

^^^^ 1"2 

- ^.c ^0 ^ .3. _^ J ^ ^ 

- - - - n-; .-^ - - - 

CAG Ar.":; CCC .oc"t^ c^ cc'iS: mc cgt'^ cct 

" AAG GAT ATC AAA GAG 

Glu Us Tyr Arg Arg Ilo Val n„ ^ . — 

V«X can sex A.„ Gly Val p„ „^ 

1530 1535 
^ GCT GAG GAG AAA S 3- 

Giu Phe i^u Lys Gly Glu Glu Lyo '""0 



Figure 12c (Continued)
OC1/4V Endoglucanase (33GP1)

[Sequence listing illegible in the source scan]

Figure 13a



WO 98/24799 28/46 PCT/US97/22623



OC1/4V Endoglucanase (33GP1) (continued)

[Sequence listing illegible in the source scan; the listing ends with the stop codon TAA at the 3' end]

Figure 13b (Continued)
[Sequence listing illegible in the source scan]

Figure 14a
(6GP3) (continued)
^ MC CCA CM CAC ^ CAA c« "c CAC <^ 59< 

o7; 01": --- --- - - - - - ^ 1" °- aac 

A.P Thr Clu Pro TVr Gin Val Val As„ 

AAC GGC ore 'KM GAA OCG r-rm «30 £39 

^ SAT CTC CAC GGA ^ ™^ 

c:y val .r, Clu ^Ix " ^ 

' - 
TAT CAC CTG CAA AAC TAC CCA a.^ f'^ 693 
.t. ^ ^« A« AGA ACA ACC GTC GAT CCT TAT ^ 

^ l.ys Il» Arg Thr Thr Val Asn ,s 
711 Asp iTo Tyr Ser Lys 

CCC CTT TAC CCA AAC AAC CAA CAO aS GCC <^ Z^! 756 

OCC CTT CTO AAT CTT re^v , " 

Ala Val Tyx; Ala I " "T GCC AGG ACA AAC 

Z " - - - - 

;^ P ^ ^ - - « — - =^ c - ^ 

ti^_ »^ T„ =^ « 22 ^ ^ 

— — ^=r; 

Lyb Aan Lya ciy Lau Tv^ r ^CC GGC 

„, ^ ^ - «•» - lii z =1; 

V.I ^ ^ ~ --- -- !!! « -3 « 

° «" "-i 

n. Pro n. " - «^ ^ TTC 0.= 

~ - ™ - ^« c.'- ^ .... 

Lv*3 Ts^ ^ GCC AGA 

^ys Tyr Tyr Asn Trp Glv TSr^ a ^ _^ 

' ^ '"^ "° ^ vai ^i: 

Figure 14b (Continued)
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1089 1098 nm 

-"f - - - « - - - ^ 

- - - - - ;i £ 



v.. ^ --; ~ --. ... ... .J 



12" aj« 



- *_=c «c ^ „ ^ ^ ^ ^ 

«. .-^>vx ~- ;~ ~ --. ... ... „. ^ 

- TV. ^. zu ^ ~ ~ -~ -■- --- --- ^. ... 



1305 1214 115-1 

=3. .3c =0. ^ ^ - - c.-; ^ 

V.1 n. s„ ~ ~ --- --- ... ... 

TAC T>GG GTA AAC GAG TAT CAC ATA GAC CCA T-Xt'I^ '^O^ 
- ^ ^ CCA TTC AGG TTC GAT CAG ATXJ GGT CTC 

Tyr Trp Val Lyo Clu Tvr Hisi ti„ > ^7 

^ H.. II. --- 

^^1^ 1422 ij-i-i 

- a./- 

n. ^ ^ ,J ~ ~. ... ... ... ... ... 

1^^' 1476 ijiQc 

x>.^„. ^ ^ ^ ~ --- ~ ... ... ^. ... „. .„ 

^ c^c'S? =0= ^ „ ^i"; 

^ V.X «. o., ~ ~ ~ ~ ;;;; ... ... 

M. n. .„ ~ --- --- -.. ... ... ... 

Figure 14c (Continued)
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..3. 

«v ,« c„ - -;- - - !r f?^ 

oix L„ „. ~- -- - ^f! t.c 

M. cy. ^ ~- -■ ~ - !^ fl! !f! ^ 

£ ^- r - ~ ^ ^ -"^ ~ - - -'S 

"° ^" - - - - - F 

• • ACT TCT CM car GW CCT TTC C-v rf^ . = 

u fcou Thr Ser Gla aiy Vai Pro Ph» t 
1899 :go8 ' Cly Cly Gin 

-c _^ ^ 

^" Asp Aan Ser Tyr Asn I 

1953 i5j2 " '^^ Sar 

ATA AAC GGC rrc GAT TAC ^''^ 1*80 

:.! A=A AAA err GAG rrc ata gao 1^ 

Xle Asn Cly Pha Asp T - f^^f ^ AAT TAC 

"-^-^!!!^!^----=aacac^??;gct^^-^ - 
- - - .3 - - - ^ - - - 

Ala Glu Clu lie "I,- ^ ^ !!! AOA ATA G^ 
2115 

---n!^!?!!!--CACGCA^-«,^-^ 2151 ,,,, 

Ala Phm M^r , , ^ GAC ATC GTXI GTG 

.eu .y, Ala ci, Gi. ^„p - - - - 

^ A-yo Afip iia v^jL Val 

Figure 14d (Continued)






[Sequence listing illegible in the source scan]

Figure 14e (Continued)
Glycosidase

CTT TTA TTG ATC GTT GAG CTC TCT TTC ^rr r 

Aia Ser Asp gXu Ptt 

GTG AAA GTG GAA AAC GGA AAA TTC r.. 

- -y v.: ... z ™ c :r r °" ™ ™ 

Asn Gly Lys Glu Phe Arg Phe

ATT GGA AGC AAC 

^ Leu Arg Hg Xrp 

- ™ ™ z ™ « - - 

V- Cys A.S .sp Lys Asr. Thr T-.r Met His 
CCr GAG CCC GGT GTT "-c r-n 

- - =t ™ - - ^--^ ^ 

Lys Ala Lys Glu Leu Gly 

AAA CTT GTC ATT GT"" rrr , 

- ... a. - - - - - ^ „^ 

Trp »,p ^.^ 

- ™ :° z 2 2 z - 2 - - - ™ «c 

^ i^-^r His His AsD Asn dh^ 

P Asp Phe Tyr Arg Asp 

--=-':=t:":;\=-::!----^c ...... 

' val Asn His 

Thr Ile Met Ala
^ ^/s Ser Gly Asn Thr 
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- z 2 z z ^: 2 z r r 

Ser Tyr Ile Lys Ser Leu Asp Pro

AAC CAC CTC GTG GCT GTG GGG GAC GAA GGA TTC TTC Aac ..r . 

- ».= V.X ™ ,J - - - - 



TTC AAA CCT TAC GGT GGA GAA GCC GAG 
Phe Lys Pro Tyr Gly Gly Glu Ala Glu 



TGG GCC TAC AAC GGC TGG TCC GGT 
Trp Ala Tyr Asn Gly Trp Ser Gly 



GTT GAC TGG AAG AAG CTC CTT rrr its r-^r, 

V- ... ... z :z n. 2 z r ^ "° 

Tar Val Asp Phe Gly Thr Phe 



CAC CTC TAT CCG TCC CAC TGG 

His Leu Tyr Pro Ser His Trp Gly Val Ser Pro Glu Asn



GGT GTC ACT CCA GAG AAC TAT GCC CAG TGG 

Tyr Ala Gin Trp 



GGA GCG AAG TGG ATA GAA GAC CAC ATA AAG ATC GCA AAA GAG ATC GGA AAA

Gly Ala Lys Trp Ile Glu Asp His Ile Lys Ile Ala Lys Glu Ile Gly Lys



CCC GTT Gr: CTG GAA GAA TAT GGA A-^ CCA T.zr n^^ 

- val .1 Leu Glu Glu Tyr Gly Ue ^ ^ ^ ^ ^ 

x^: z: r --^ - - - - 

Tyr Arg Leu Trp Asn Asp Leu Val Tyr Asp Leu Gly Gly Asp

Phe Trp Met Leu Ala Gly Ile Gly Glu Gly Ser Asp Arg Asp

GAG AGA GGG TAC TAT CCG GAC TAC GAC GGT TTC AGA ATA GTG AAC GAC GAC

Glu Arg Gly Tyr Tyr Pro Asp Tyr Asp Gly Phe Arg Ile Val Asn Asp Asp

b wiu lyr Aia Lys Leu Phe Asn Thr Gly 



ct ::i 2 T G? - 

Asp Ile Arg Glu Asp Thr Cys Ser Phe Ile Leu



GGC ATG 



Pro Lys Asp Gly Met 



GAG ATC AAA AAG ACC GTG GAA GTG AGG GCT GGT GTT TTC GAC TAC 



AGC AAC 



Figure 15b (continued) 
Gly Val Phe Asp Tyr Ser Asn

ACG TTT GAA AAG TTG TCT GTC AAA GTr r.. . 

"^tr::::-%t--:;.'::;^"==^c ...... 

^ Pile Asp Leu Asp Thr Thr Arg 
ATC CCG GAT GGA GAA CAT GAA ATG TTC CT- gaa rrn 

- - ..P 0.. - - - C.0 

ACG GTG AAA GAC TCT ATC AAA GCG AAA GTG ^-rr . 

- - - ... - - ^ - - 

CTC GCA GAG GAA GTT OAT TTT TCC TCT ro» ... 

- - - V. - 2 z z: z 

AAC AGC GGA ACr '^rr ^« 

-A AC. .GG .A. GCA GAG TTC GGG TCA C"- Gi- 

Asn Ser Gly Thr Trp Gin Ala G^u -h. r- AAC 

rtxa i'he Qiy se^ ^ 

Gly Ser Pro Asp Ile Glu Trp Asn

=GT GAG GTG GGA AAT GGA GCA CTG CAG CTG AAC -G A.^ . 
Oly Glu Val Gly Asn Gly Ala Leu Gl r ^ "° '^'^^ AAG 

y Leu Gln Leu Asn Val Lys Leu Pro Gly Lys

AGC GAC TGO GAA GAA GTG AGA GTA GCA AGG AA 

.sp Trp Glu Glu Val Ar. Val Al" L ^ 2 T 

a ^ys ...e Glu Arg Leu Ser Glu 

TGT GAG ATC CTC GAG TAC GAC ATC TAC ATT CCA A.r . 

~ - r r - - - 

Tyt AU V.1 

=TC OAC ATO AAC AAC OCG AAC OTtS OAA ACT GCC r.^ 

- «p A„ A„ ... A,, v.. IT. " 'r 

lie lie Thr Phe Gly 

GGA AAA GAG TAC AGA AGA TTC CAT G-A ACA 

^ A., ^ n r 

^'^^ Asp Arg Thr Ala 



Figure 15c (continued)
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oiv vii r T °" '""^ 

Gly Val Lys Glu Leu His Ile Gly Val Val Gly Asp Leu Arg Tyr Asp

GCA CCG A^ TTC ATC GAT AAT GTC AGA CTT TAT AAA AGA ACA GGA GGT ATG 
Gly Pro Ile Phe Ile Asp Asn Val Arg Leu Tyr Lys Arg Thr Gly Gly Met

TGA 1991 
END 



Figure 15d (continued)
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i ATG AAA AGA ATC GAC CTG x^^^ 

' - z: z z ™ - r ------- 

Arg Asp Asn Glu Gly Arg Phe Ser 

" "T GAA GGG ACT GTG CCA Bar . 
121 CAC CCG TAC GTT GGG ATG AAC CA» r.. 

" - - v.. - - - - - 2 T. 2 r - - 

IlB Glu Asp Arg Glu Trp He jo 

181 TAC GAG AGG GAG TTC GAG TTC A»» 

^ Val Tyr Leu Gly Ser Thr icq 

3 01 GAA GAC ATG TTC A''- G AT 

».= ::: :;; - - - - - ™ ^ 

■* ^ ^^'-^ ^i-^ ^'l--^ -ys Asn His 12C 

- ^ s s - - - - - - ^ 

Leu Glu Gin Asn Tyr Giy 140 

421 GTC CT'^ GGC 

- - z s z r "» •« - - 

3 vj./ .yr j.iQ Arg Lvs aIa m ^ 

y -ys Ala Gin Tyr Ser Tyr leo 

«t z z i: z z z z z z z z z z r - - - - 

' " Val Tyr Leu Glu leo 

S4l GTG TAC AGG GCA CGT CTT CAG ri- t^, 

Val Tyr Arg Ala Arg Leu Cln Lt ^ Tu T ^ 

-hr Ala Tyr Leu Leu Glu z-eu Glu Gly .ys ASP 200 

----^-^-2zzzzzzzz ::: 

GTA AAC GGT GAA AAG ATA GGG GAG T'T CCT GTT err 

val Asn Gly ... ,,3 ne Gly Glu pj Z Zl T T ^'^ 

" -^i/ =lu Lys Leu Phe 240 

z z z z z z z z ;:: r r - - - 

y>: P-0 Trp Asn Val Gly Lys Pro 260 
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78 X - TAC CTG TAC GAT TTC GTT Trr rrn 

ATC GOT TTG AGA AGA GTC AGA ATC G- CAG r.. . 

- - - - ... „: z z z z z z z z z ;:: 

'0. rrc At. TTC GM »rc „c cot 0T= ttc cc- „- , 

- - n. ». ... ... - - - - - - ^rr t„ 

--^^^zzzzzzzzzzzzzzz T 
-^^^^zzzzzzzzzzzzzzzz 

IKl GAA TAT CCG GAT CAT CTT CCG TGG TTC AGA AAA G- ... . 

381 Glu Tyr Pro Asp His Leu P-o p.. , '^^^ ATT IJOO 

•rp P.,e Arg .ys Lau AU Asr. Glu Glu Aia A.-g Lys He 40 = 

1201 GTG AGA AAA CTC AGA TAC CAT CCC TCC ATT GTT CTC 

401 Val Arg Lys Leu Arg Tyr His Pro Ser Ile Val Leu

i.^ Cys Gly Asn Asr. Glu Asn Asn 420 

,0G GGA TTC GAT GAA TGG GGA AAT ATG GCC AOA AAA GTG GAT GC^ A"-C A.^ 
«1 Trp Gly P.. Glu Trp Gly Asn Me. Ala A.g Lys V • A^p Z A T 

:i /3 va. Asp ^ly He Asn Leu Gly Asn 44 a 

■ " z z z z z z z z z z z z z z z z z z z 

--^^^zzzzzzzzzzzzzzzz 
z z zzz z z z z z z z z z z z z - - - 

^ro G.u Thr lie Glu Phe Phe Ser 520 

1561 AAA CCC GAG GAA AGA GAG ATA TTC cat rrr r^n 

- „: z z ™ :::: r 

Met Leu Lys His Asn Lys Gin Val Glu 540 
Figure 16b (continued)
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1621 GGA CAG GAA AGA TTG ATC Aflr . 

'■'--■zzzzzzzzzzzzzz'-- 

Lys Ala Leu Tyr Tyr Tyr 620

1861 GCO AGA AGA TTC TTC GCy ri. r^^^ ^ 

y Lys Arg Asp Asn Lys Ile Glu 640

= t, r - - - - - 

Lys Arg Ser Leu Ser Gly Ala Cys Ser Leu 660

l.ei CCA OAA GAA 000 AOA AAA =GT ATT CCA AAA CAC -^^A CA- A.^ 

0.U C.U .y A.. Lye O.y ^ ^ ^ - - A3C ACA C=C .0.0 

Thr Pro Ser Arg Arg 680



2041 TGT GAG TTT GGT TGA 2055
681 Cys Glu Phe Gly End 685



Figure 16c (continued)
Figure No. 17a. Bankia gouldi (37GP4)

1 ATC AAA AM AAT CTA CTA ATO TTT AAA ACG CTT ACO TAT CTA CCT TTC TTT TTA ATG CTG 
1 Met .ys .,3 As„ .eu Mec Phe ..s Arg .eu Thr T.r X,e. Pro .eu Phe Uu Mec "u 

ei CTC TCA CTA AGT TCA GTA GCT CAA TCT CCT GTA GAA AAA CAT GGC CGT TTA CAA GTT GAC 
- .eu Ser .eu Ser Ser Va: AZa GXn Ser Pro va o:u oi, Ar. .eu G^ HI Z 

in GGA AAC CGC ATT CTT AAT GCG TCT GGA CAA ATT ACG AGC TTA GCT GGT AAC AGC CTC TTT 

181 TGG ACT AAT GCT GGA GAC ACC TCC GAT TTT TAT AAT GCA GAA ACT GTT GAT TTT TTA GC 
" Trp ser Asn Ala GI, A«p T.r Ser Aap Phe Tyr A.„ AXa GIu T.r Val A.p Z "u " 

'II 2 Z 7 r °" "^'^ 



60 20
120 40
180
Phe 60
GCA 240
80
300 100


301 GGA AAT GGC TA"^ Att* — -..^ „ 

Gly Asn Gly Tyr Ile Asp Ser Pro Gln Gl. Gln Glu Ala Lys Ile Arg Lys Val Ile Asp 120

12' Z Z Z Z T °" 

Ala Ala Ile Ala Asn Gly Ile Tyr Val Ile Ile Asp Trp His Thr His Glu Ala Glu Leu 140



Tyr t" r r ^ ACC AGA ATG GCA GAC CTA TAC GGA GAT ACT CCC 

Tyr Thr Asp Glu Ala Val Asp Phe Phe Thr Arg Met Ala Asp Leu Tyr Gly Asp Thr Pro

AAT GTA ATG TAT GAA ATT TAT AAC GAG CCT ATA TAC CAA AGT TGG CCT GTT AT^ AAG AAT 
-1 As„ val Met Tyr Glu He Tyr Asn Glu Pro He Tyr Gin Ser Trp Pro Val ri; Z Z 

541 TAT GCA GAG CAA GTA ATT GCT GGT ATA CGT TCT AAA GAC CCA GAT AAT TTA ATA ATT GTA 
181 Tyr Ala Glu Gln Val Ile Ala Gly Ile Arg Ser Lys Asp Pro Asp Asn Leu Ile Ile Val

"1 Gly Thr Ser Asn Tyr Ser Gin Gin Val Asp Val Ala Ser Ala Asp Pro He Ser Asp t" 

a": z r r ™ - - - - 

val Ala Tyr Thr Leu His Phe Tyr Ala Ala Phe Asn Pro His Asp Asn Leu Arg Asn 
721 GTA GCA CAG ACA GCA TTA GAT AAT AAT GTT GCT TTG TTT GTT ACA GAA TGG GGT ACA ATT 780
241 Val Ala Gln Thr Ala Leu Asp Asn Asn Val Ala Leu Phe Val Thr Glu Trp Gly Thr Ile 260



480 160
540 180
600 200
660 220
720 240
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"31 GGA GAT GM ATT ATA ATT GCC CCT GGA AAC TAC AAT -T- r.. 

Gly Asp Glu lie He He Ala p^o G'v T Z ^° ^ 

Pro Gly Asn Tyr Asn Phe Gln Asp Lys Ile Gln Gly Ala 380

1141 T7T AAC CQf aG*^ GTT ^ 
3B1 Phe Asn ;^ Val ^ Z Z T T 

Gly Ser Ala Asn Gly Asn Ser Thr Asn Pro Ile Ile 400

"0- TTA AGA GGC GAA AGC GCT ACA AAC CCT G- ~TC --A ~r. ^ 

^eu Arg Gly Glu Se. Ala T... Asn Pro P.; va' Z T T a"' "^'^ 

Phe Ser Gly Leu Asp Tyr Asn Asn Gly 420

12S1 TAC CTA TTA ACT ATT GAA GGT GA" -AT TGG »»t .n-- , 
«1 Tyr Leu Leu Ser lie Glu G^v Asn r I "'^ m AAA ACT GGG 1320 

Glu Gly Asp Tyr Trp Asn Ile Lys Asp Ile Glu Phe Lys Thr Gly 440

1321 TCT AAA GGT ATT GTT CTT GAC AAT TC- AA' GGT AG- A^. 

ser Lys Gly He Val Leu Asp Asn Se ^ Z t ™ ^ ^= =^ C.rr CAT 13SC 

Asn Gly Ser Lys Leu Lys Asn Leu Val Val His 460

z z - - 2 z - z z: z z z z z z z - z z r 
--^^'^--zzzzzzzzzzz 

- z z z z z z Z Z r - - - " - - 

>.„ ..r oi„ „, Al. cy. a.. „, ,„ „. 

z z z z z r r ? ^ - - - - - ... 

T»r Al. «,„ a, ».i v.l 1,. ,„ 

Figure 17b (continued)
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ser Gly Olu Asn Ser Ser Asp jgo 

1S81 GCT TTT ATT GAT TTA AAA GGA GCC TAT GOT TT' GTi r^n 

... „. ^ - - - - „ „ 

1861 TCA ACT OCT COT AAA AAA CAA GGT TCT CCT GAA CAA A^T CAC G- rrr 

- - ^ 0. - - - - - - ^ ... 

z :z z z iz :z t r °" - - - - - 

-Hsp Fr.e Pro .le Ser Asp Gly t^^- gi,. sot, t . 

P ^-y .... u.u Asn Leu /al Asn Lvs Phe 660 

TGC CCA GAT TGG AAT ATA GAA CCA TGT AAT C- GTA C^" . .r. 

c,. ,„ ,^ - - - 

z: z z z z z z z z z z - - -«» 

-n. .eu va. v..u Gly Tyr Asn Leu Gin Val 700 
2101 GAA GTT AAT GCT ACT GAT GCA GAT GGA ACT ATT GAT AA- "TA Ai^ ^ 

- ... v.. .„ „. .^^ - - - 

21S1 AAT TTA GTT AGG CAA ATA AAT TCT ACT TCA TAT AAA TGG orr . 

- - ... ^ n. z z. z z z z z -z 

2221 ACA GAT GAA CTT AAT GGT CTT ACA GAA GGA ACT TAT ACC '-TA AA» rr- 

J - ™ ™ - - - »^ ... 

2281 AAC GAC GGG GCT TCT ACA GAA ACG CAA TTT ACG TTA ACT GTA AT. 

A3P Gly Al. Se. T.. Glu T... Gl. P.e T.. : ^ 

Val Ile Thr Glu Gln Ser Pro 780

---^zzzzzzzzzzzzzzzzz 

J«l AAG TTT TCT AAC GTT TTT GAG TTA GGA TCT GGC GGA CCA TCT TTA ACT AAT ™ 

.^i ilA AGT AAT TTA AAA ACA 2460 

Figure 17c (continued)
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SOI- Ly, Phe ser Asn Val ph, q,, , 

— -.e.o:„ J--. CAA.AAACACAAAC 

-0 

- - V. p„ - - - - AAA CCA AAA A„ ACC CAC ^ 

^^^^^'------^^^:::s::--c::::r--- ^^^^ 

"A ACA TCA OAT AAC «r '^^P Asp Tyr Trp aao 

Asn Phe Th-^ Ti 

"-ACXAATOACCCXACTCCT. ^^^^-le T.. 

^'^e ser Asn Aap ^ «^ «T .TT AC= CCT ACT AA. 

P -hr Ala Pro Cys Asn Vai r.^r , ^ ^ ^'^0 

— Asp Asp Ser Se- ri„ ^ '--.T CCT G-'r 

A^--^ fhe Lys Leu Tyr Pro As- p. rr =^ ^"c 

A„^„,,„^^^ ASP SI. xhr ,.0 

Val Pro 



Figure 17d (continued)
Figure No. 18a. Pyrococcus furiosus VC1 (7EG1)
leader sequence: amino acids 1-24 

S • ATG AGC AAG AAA AAG TT^ GTC ATC GTA TCT ATC T^^ ACA ATC C^^ TTA ar. " 
«ec se. .,3 ... .,3 V. v. Se. x.e .eu ... ^ 

" 90 OS 

CCA ATA TAT TTT GTA GAA AAG TAT CAT ACC TCT GAG GAC AAG TCA ACT TCA 2 
Ala ne ryr PHe Val GZu .,3 ryr H.s T.. Se. GIu Asp sTr sTr Z 



"7 126 



135 



ACC XCA TCT ACA CCA CCC CAA AC ACA CTT TCC ACC AAG G^ CTC AAG 

Thr se. ser T.r Pro Pro Gin Thr T.r Leu Ser r.r Thr Lys VaX .eu "s 

"5 "8 207 ,16 

AGA CCT GAT GAC GGT CAG TGG =CA GGA GCT CC. AT. CAT AAG OAT 00. GA 

Arg Ty. p.o Asp Asp =1. p,, ^^^^ ^^^^ ^^^^ 

2" 252 261 270 

OGG AAC CCA GAA TTC TAC ATT GAA ATA AAC CTA TGG AAC ATT CTT AAT GC-^ AC^ 
Gly Asn Pro Glu Phe Tyr Ile Glu Ile Asn Leu Trp Asn Ile Leu Asn Ala Thr

2" 288 297 30S 33^5 

GGA TTT GCT GAG ATG ACG TAC AAT TTA ACC AGC GGC GTC CTT CAC TAC GTC ^ 
01. Phe Ala Glu Me. THr Tyr A.„ Leu T^r Ser Gly Val .eu H^s ^ Zl 2 

351 360 

CAA CTT GAC AAC ATT GTC TTG AGG GAT AGA ACT A^^ TGG CTG c" GGA TAC Zl 
Gln Leu Asp Asn Ile Val Leu Arg Asp Arg Ser Asn Trp Val His Gly Tyr Pro

GAA ATA TTC TAT GGA IIc AAG CCA ^GG AAT GCA ^^C TAC GCA ACT GAT GGC cd 
Glu Ile Phe Tyr Gly Asn Lys Pro Trp Asn Ala Asn Tyr Ala Thr Asp Gly Pro

ATA CCA TTA CCC ACT Z GTT TCA I^C CTA ACA 2 TTC TAT CtI ACA ATC TCC 
Ile Pro Leu Pro Ser Lys Val Ser Asn Leu Thr Asp Phe Tyr Leu Thr Ile Ser
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S04 

TAT AM CTT GAG CCC AAG l.^r . 52^ 

.eu GZu Pro . ' ^AC ttc GCA 111 

Pro i,ys Asn oly Pro lie As„ pT * ^GG 

phe Aia He gIu ser Trp 

558 

ACG AGA OAA GCT TGG Ar» S8S 

- - «. ^'i" r - ^ °- - - ™ 

A„ ser Asp Glu g1„ 

512 

" r "° ™ - - 

^ ^ «. r r °" "° 

^ Ser Lys Val Lyg 

- :t ™ r - - - « - z - 

TGG AAG GCA AAC AT- G-T -r- 74 7 

T-^ T Al- G«T TGG GAG TAT GTT rra '56 

" «" o:-, - - ~ „= .e= « 

- -^3 Ue Lys Thr Pro Ua 

- - 5 ^" i - :z ? « - - - 

- ^ My p.. «^ 

819 628 
ATT TCA AGO TTA CCA AAT TAC ACA GM 

.eu pro Asn ^r T.r Z Z Z T '^^^ - 

^ Asp Val Glu He Gly 

" Ala His Leu Glu Tts t-^ r, 

Trp He Thr 

936 

AAC ATA ACA CTA ACT CCT C-;> p.. 



Figure 18b (continued)
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